Sex and the law


A report from South Africa on the science of human sexuality and its implications for policy-making 
brings African countries a step closer to confronting laws that criminalize homosexuality. 


The motto of the Academy of Science of South Africa is: “Applying
scientific thinking in the service of society.” There are many
types of scientific thinking, of course, and not all of them serve
society particularly well. Scientific thinking on homosexuality, for
instance, has a very chequered past.

Until the mid-1970s, the American Psychiatric Association listed
homosexuality in its official manual of mental disorders. Academic
journals at the time were filled with case reports of psychologists and
medics trying to turn gay men straight. A new book, ‘Curing Queers’:
Mental Nurses and Their Patients, 1935-74 by Tommy Dickinson,
details cases of such ‘aversion therapy’ from the United Kingdom,
where behavioural psychologists tried to erase homosexual behaviour
by associating it with unpleasant sensations, including pain.

Scientific thinking on homosexuality, and other issues of sex,
sexuality and gender, has moved on considerably since then. Thankfully,
so too have many societies. Last month, Ireland became the latest
country to legalize same-sex marriage. Science played no part in that
decision, and why should it have?

Unfortunately, not everyone sees it that way. Science — or, more
accurately, a flawed version of scientific thinking — is still used as a
cloak for prejudice and persecution of homosexuals in countries across
Africa and elsewhere. In February last year, for example, the press office
of the Ugandan presidential State House formally announced that
President Yoweri Museveni was to sign an “anti-gay bill after experts
prove there is no connection between biology and being gay”.

The ‘scientific thinking’ here (and bear with us) is that, because
researchers have not found a specific gene that is associated with
homosexuality, science cannot say that some people are born gay. And if they
are not born that way, the elastic logic goes, homosexuality is a lifestyle
choice. And states are within their rights to criminalize some behaviour.
“I want a scientific answer,” the president said, “not a political answer.”

As we report on page 135, a scientific answer on this question is now
available. The Academy of Science of South Africa has published a
comprehensive study on the science of human sexuality and the
implications for policy (see go.nature.com/q3rr4k). The report demolishes
the political lie that anti-gay laws are supported by scientific evidence.
And it shows that, contrary to the public-health claims of politicians
who want to criminalize homosexuality, such laws hamper efforts to
combat the spread of HIV and other sexually transmitted infections.

What difference will this report make? It would be naive to expect
that rational argument — scientific thinking — can draw the poison
from the venomous attitudes that drive hatred and prejudice. But the
report, if it is distributed widely, can still act as a useful tool for those
who have the courage within Africa to oppose unjust laws.

As the report points out, there is precedent here. South Africa under
the apartheid regime, and other places, tried to justify laws against
mixed-race marriages with references to science and public health.
The ‘natural order’ demanded that everyone stick to their own ethnic
and racial groups. Countering such claims alone does not dismantle
the regime that produces them, but it offers ammunition to undermine
claims to legitimacy that such regimes may make. Science helped to strip
away the cloak to reveal the true, ugly motivations for such racial
discrimination (and continues to do so, because the argument that ‘mixed’
couples produce more-dysfunctional families than non-mixed ones still
rears its head from time to time). And it can
do the same for anti-gay rhetoric too.

This is not an easy subject for scientists in
Africa to cover. The South African academy
deserves great credit for taking on this topic,
and for producing such an unvarnished
account of the true state of the scientific
evidence and what that means for evidence-based
policy. Credit, too, should go to the
Uganda National Academy of Sciences,
which has officially endorsed the findings.

Museveni has the scientific answer he requested. As a phrase used 
many times in the report reads, the study “could find no evidence that” 
homosexuality is anything other than a feature on a spectrum of human 
sexuality. Indeed, the more that scientific thinking is applied to 
human sex and gender issues, the clearer it becomes that the evidence 
points towards greater diversity as the norm, not a culturally determined
number of select options.

Spread the word. Share the report and its findings. Its conclusions, 
to those who respect scientific evidence, may be unremarkable. But 
sometimes stating the obvious again and again until people start to 
listen can be the best way for scientific thinking to serve society.




Undue burdens 


Proposed controls on foreign operations in China 
are a threat to scientific collaboration. 


China seems to be cracking down on everything at the moment.
Its anti-corruption drive has government officials and businesses
in all sectors shaking. The government has tightened its
grip on the Internet, and the block on accessing Google and Google
Scholar in China has hamstrung researchers’ ability to keep abreast
of the latest scientific trends. Some proposed restrictions are so vague
that they could be applied to almost anything. What do government
officials mean, for example, when they say that “Western values” have
no place in Chinese university textbooks?

There are many reasons for these moves. President Xi Jinping is still
consolidating power, in a system rife with corruption. Meanwhile, an
increasingly vocal populace complains of rich officials, environmental 
problems and food safety. 

The government wants to stay in charge of efforts to deal with 
problems and maintain its goal of stability. It is not alone in such 
efforts. And it is not alone in setting its sights on what it sees as a pos- 
sible source of dissent and social strife: non-governmental organiza- 
tions (NGOs). Russia and India in recent months have already set 
out worrying plans to stifle such operations. Now China is following. 

In China, domestic NGOs are, for the most part, government-organized
bodies, and so are still under government influence. But
foreign NGOs are a concern to the government, and a potential destabilizing
force, especially when they try to spread “Western values”.

Last year, the government surveyed foreign NGOs operating in China 
and counted about 1,000 permanent operations; when short-term pro- 
jects are included, NGOs in China number up to 6,000. The government 
estimates that these groups pour hundreds of millions of dollars into 
some 20 areas, including health, environmental protection and educa- 
tion. To Chinese officials, these are alarmingly high numbers. 

China feels that its grip on these organizations has been too loose. 
Accordingly, over the past month it has sought comments on a new 
draft law — the Non-Mainland Non-Governmental Organizations 
Management Law — that will tighten restrictions on NGOs. 

The move may not be a surprise, given the political mood. But the 
proposed scope of the law is broader than many people expected, and 
is causing alarm. Its definition of an NGO is so broad — all activities 
of “not-for-profit, non-governmental social organizations” — that, 
according to Jia Xijin, a specialist on NGOs in China at Tsinghua Uni- 
versity, it covers all organized activities between Chinese nationals and 
foreigners. Many people, citizens and visitors alike, probably have no 
idea that the law will apply to them. 

The new rules would require individuals or institutions wishing 
to carry out activities in China to get a sponsor, such as a ministry or 
other agency of local government. Then they must apply for permis- 
sion — not to the civil-affairs ministry, as in the existing system, but 
to the public-security bureau. 

What will happen when the public-security bureaus, which are 
accustomed to operating with a police mindset, start sizing up applica- 
tions for scientific collaborations? At the very least, the result would be 
undue, and potentially forbidding, restrictions and red tape. It would
probably, for instance, discourage studies of environmental problems
that regional governments are not ready to admit to. At the very worst,
it would allow the persecution of institutions or of individuals from
blacklisted institutions.

Could a political demonstration at a university overseas mean
that researchers from that university would no longer be welcome
in China? What if an individual had some other political connection
that made officials uncomfortable?

The proposed law is not explicit in how it should be applied to
specific collaborations or specific research projects. Those associated
with universities or scientific societies
in China fear that the decisions will be deferred to officials with little
experience of science. What happens when these officials come across 
a project they do not fully understand? Will they want to take a chance 
on it? Most probably, observers fear, they would rather just reject it — 
or, more likely, sit on it — and make things easy for themselves. 

Overseas institutions have already expressed concern. One response 
to the Chinese consultation came from Harvard University in Cam- 
bridge, Massachusetts, which said that “universities should not be 
treated as non-governmental organizations and should not be sub- 
ject to its provisions which, if implemented, could inadvertently make 
future transnational faculty and student collaborations more difficult, 
and therefore less frequent”. 

In an e-mail to Nature, a Harvard spokesperson put it diplomati- 
cally: “We would have concern with any law that might inhibit the 
future ability of faculty and students to work together on common 
areas of interest by creating new, undue burdens.” 

Jia says that second- or third-tier universities, and especially local or 
private universities, are likely to suffer. Whereas prominent government- 
affiliated universities such as Tsinghua or Peking universities, both in 
Beijing, would probably be accredited as authorized hosts for foreign 
NGOs that want to carry out temporary activities in China, smaller 
and less-well-connected universities are unlikely to get such approval. 

Both science and China have benefited from the emphasis that the 
Chinese government has placed in recent years on research as a driver 
of growth and development. International links are a key component 
of that. To weaken such networks could do more than cut useful ties. 
It could undermine the stability that they help to bring.


Tough targets 


Concrete goals set out by the G7 nations lay the 
groundwork for a climate accord. 


The Group of 7 (G7) leading industrialized nations this week
called for global greenhouse-gas emissions to be reduced
by around 70% by 2050, and for the world economy to be
decarbonized by the end of the twenty-first century. These twin goals
were issued in a communiqué at the conclusion of the group’s meeting
at Schloss Elmau in Krün, Germany, on 8 June, alongside a suite
of promises to help developing nations to provide their citizens with
clean energy, jobs, financial security and food.

To the credit of German Chancellor Angela Merkel, leader of the host
nation, the commitments surpass all of the G7’s previous promises. Most
notably, the group has formally acknowledged — and quantified — the
scale of the industrial renaissance that will be required to keep global
average temperature increase to less than 2°C above pre-industrial levels.
It has provided concrete and measurable targets that should help
to make clear where precious capital and human resources should be
invested — not just for other governments, but also for businesses. It
should also make clear where resources should not be expended. The
G7 nations renewed their pledge to end “inefficient” fossil-fuel subsidies.

The nations also reaffirmed a commitment, made in Copenhagen in 
2009, to increase climate aid for developing countries to US$100 billion 
per year by 2020, including both public and private financing. The com- 
muniqué calls for an expansion of renewable energy in developing coun- 
tries, and further work to help the most vulnerable countries to prepare 
for climate change. In particular, the G7 pledged to ensure that 400 mil- 
lion people in developing nations have access to climate-risk insurance, 
to mitigate the effects of disasters such as droughts and storms. 

The timing is good. Nations are wrapping up the latest round of 
climate talks in Bonn this week, with the aim of advancing a climate 
agreement to be signed in Paris later this year. Policy-makers have 
their work cut out if they are to sign a meaningful accord, and the G7 
meeting represents a small step in the right direction. 

But the world is still waiting for action that will give these targets 
credibility. Countries should adopt the G7 communiqué’s emissions 
targets and look for ways to expand climate-related investment in the 
developing world, where emissions are poised to rise quickly if no 
intervention is made. The communiqué rightly 
points out that engagement by the private sector 
will be crucial to meeting these goals, but it is 
up to policy-makers to lay down the rules of the 
road.
Funders must encourage
scientists to share


To realize the full potential of large data sets, researchers must agree on better
ways to pass data around, says Martin Bobrow.


How can we make best use of the vast amounts of data on
genomics, epidemiology and population-level health being
collected by researchers? Maximizing the benefits depends
on how well we as a scientific community share information.

The Human Genome Project set strong precedents for rapid pre-publication
data sharing, and all biological research has benefited
enormously from this approach. Most research-funding agencies, and
most scientists, now agree that research data should be shared —
provided that those who donate their data and samples are protected. This
approach is strongly advocated by organizations such as the Global
Alliance for Genomics and Health. But data sharing will work well only
when it is streamlined, efficient and fair. How can more scientists be
encouraged and helped to make their data available, without adding
an undue administrative burden?

I chair an expert advisory group on data access
that has examined this question. As part of our
work, we surveyed current practices and questioned
Nature readers. We saw plenty of good
practice — in the UK social-sciences community,
for example — but also significant inefficiencies.
Both those who generate data and those who
want to use them expressed frustration at the way
that data-access processes are frequently opaque.

At present, mechanisms for data sharing are too
often an afterthought. Access protocols are set up
and managed differently from study to study, and
this adds to the administrative burden for both
producers and users. No one wins in this scenario,
least of all those who donate their personal data.

Today, we publish our recommendations (see
www.wellcome.ac.uk/EAGDA). They are aimed at research funders,
who are best placed to implement them. But we hope that all researchers
will find them useful. A key recommendation is that data-access
plans should be integral to the grant-application process. Researchers
should set out what they regard as a reasonable process for governing
and managing access, including estimates of the costs of making the data
visible and available to other researchers. The review process should
advise on this and the data-access plan should be an integral, auditable
part of the funded grant.

Generally speaking, bigger studies will need more-substantial
processes. Small experimental studies may reasonably do no more
than make their data available on request, after the time to prepare for
publication. Very large studies require a more formal data-access plan
from their inception.

Many epidemiological or genomic studies


and the undertakings they ask of potential data users, are usually similar 
across studies. Where possible, funders should encourage the stream- 
lining and standardization of this process, while allowing for the fact 
that studies have their own characteristics. It would be helpful, where 
possible, to introduce common application forms and adjudication 
processes, and to allow new studies to make use of or consolidate with 
existing DACs. Access procedures should be made more transparent 
and straightforward by including an independent appeals process for 
settling disputes over access requests. 

Protecting research participants is sometimes cited as a reason for
withholding data. The risk that research participants could be re-identified
from shared data must be carefully assessed, particularly when
data sets are linked in novel ways. But safeguarding participants’ identity
should not require a complex or opaque system of
data access, as often seems to be the case.

It is easier to protect subjects if researchers 
build data access into their studies from the 
beginning. Participant consent forms, for exam- 
ple, should be designed with data sharing in 
mind — granting permission for de-identified 
personal information to be shared safely with 
researchers outside the study group. 

It is reasonable for scientists to impose certain 
conditions or restrictions on the use of their hard- 
earned data sets, but these should be proportion- 
ate and kept to a minimum. Justifiable conditions 
can range from requiring secondary users to 
acknowledge the source of the data in publica- 
tions, to stipulating a fair embargo time on the 
use of new data releases. Whatever the conditions 
imposed, they need to be presented clearly to data users. 

Criteria used to judge academic careers still focus heavily on 
individual publication records and provide little incentive for wider 
data sharing. Scientists who let others use their data deserve reward too. 

To build trust, any significant breaches of data- and material- 
transfer agreements should be treated seriously, with appropriate 
sanctions being imposed, such as prevention of future access to data 
sets, or forcing the withdrawal of a published paper. 

Funders should expect that each data set they support will be made 
accessible unless there are particular, agreed reasons for it not to be. 
Science is increasingly a joint, international and collaborative enterprise.
The emphasis now must be on encouraging scientists, with
support and resources from funders, to voluntarily make their data
more readily available to others.


Martin Bobrow is emeritus professor of medical genetics at the 
University of Cambridge, UK, and chair of the Expert Advisory Group 
on Data Access. 

e-mail: mb238@cam.ac.uk 
RESEARCH HIGHLIGHTS


Energy stored 
inside an aerogel 


Researchers have created a 
promising 3D energy-storage 
device using a porous aerogel. 
These ‘supercapacitors’ could 
offer much higher power 
densities than conventional 
structures. 

Mahiar Hamedi at the
KTH Royal Institute of 
Technology in Stockholm 
and his colleagues coated the 
foamy interior of an aerogel 
with carbon nanotubes to 
create an electrode. They 
covered this with an insulating 
plastic, followed by another 
nanotube electrode layer. 
This formed a supercapacitor 
that showed stable charging 
and discharging over 
400 cycles, and maintained its 
performance when the aerogel 
was compressed by up to 75%. 

Aerogels have the largest 
internal surface area of any 
synthetic material, so such 
components could store large 
amounts of power in a range of 
electronic devices. 
Nature Commun. 6, 7259 (2015) 


Why human eggs 
are error-prone 


The cellular machinery for 
separating chromosomes is 
unusually unstable in human 
eggs. This makes the eggs 
prone to having abnormal 
numbers of chromosomes, 
which can result in pregnancy 
loss and genetic disorders. 
When cells divide to make 
eggs or sperm, chromosome 
pairs separate owing to 
spindle-shaped cellular 
machinery. Melina Schuh 
at the MRC Laboratory 
of Molecular Biology in 
Cambridge, UK, and her 
colleagues observed this 
process in more than
100 live egg cells from
women undergoing fertility
treatments. They found that
the chromosome segregation
period was unusually long,
lasting about 16 hours. In
many egg cells, the spindles
were unstable, causing the
chromosomes to lag behind
during separation, and
increasing the risk that they
would not reach the correct
side of the spindle before the
cells divided.
Science 348, 1143-1147 (2015)


The cost of native and GM cotton crops


Native cotton in India can generate similar
profits to genetically modified (GM) cotton
when both are grown without irrigation.

Carla Romeu-Dalmau, Liam Dolan and
their colleagues at the University of Oxford,
UK, compared the economic impact of
growing native Asiatic cotton (Gossypium
arboreum L.) with that of growing American
Bt cotton (Bt Gossypium hirsutum), which has
been engineered to contain bacterial genes
that make the crop resistant to insect pests.
They found that farmers in the Indian state of
Maharashtra spent more money to produce
Bt cotton than native cotton, even though
Bt cotton generates higher yields.

The authors also looked at farming Bt cotton
under different conditions, and found
that the GM cotton grown under rain-fed
conditions has similar economic benefits to the
same cotton grown using irrigation. Although
Bt cotton gives higher yields with irrigation
than without, growing it under these conditions
costs more and eats into profits.

Farmers should bear in mind a range
of factors, including expenses and water
availability, when choosing which crop to plant,
the authors suggest.
Nature Plants 1, 15072 (2015)


Hot storms bring
big rainfall swings


As temperatures rise, heavy
rainfall during storms
becomes even heavier,
whereas lighter bursts grow
less intense. This could
bring storms that are more
unpredictable and destructive
as the climate warms.

Conrad Wasko and Ashish
Sharma at the University of
New South Wales in Sydney,
Australia, analysed high-resolution
rainfall data from
79 locations across Australia
from 1955 to 2005. They found
that, at all latitudes, Australian
rainfall patterns became less
uniform as temperatures rose,
and the authors predicted a
5-20% increase in the peak
water flow rate during floods
at temperatures 5 °C warmer
than today.

A warmer climate could
lead to short-term floods that
are more intense, the authors
suggest.
Nature Geosci. http://dx.doi.org/10.1038/ngeo2456 (2015)


Lazy male birds
pay a high price


Male songbirds that sleep late 
risk having their female 
partners mate with another 
male. 

Mating outside of a
monogamous pair in birds 
normally happens early in the 
morning. To find out if rising 
earlier or later would affect 
reproductive patterns of great 
tits (Parus major), Timothy 
Greives of North Dakota 
State University in Fargo 
and his co-workers captured 
male birds in Germany and 
implanted them with a device 
that releases melatonin. 

This hormone is generated 
mostly at night to set the 
circadian clock. Male tits that 
had night-time-like levels of 
melatonin around the clock 
began their daily activities on 
average 10 minutes later than 
the control group. Their nests 
also contained more offspring 
fathered by another male, 
suggesting that the late-rising 
males were less able to defend 
their mates. 

The results demonstrate 
how sexual selection affects 
circadian rhythms in the wild. 
Funct. Ecol. http://doi.org/44c 
(2015) 


EVOLUTIONARY BIOLOGY 


Galapagos iguanas 
share genes 


Swimming lizards on one 
of the Galapagos Islands are 
evolving into new species, 
but they also seem to be 
mating with lizards from 
neighbouring islands 
— possibly helping to 
incorporate adaptations 
from other populations 
into their gene pool. 
Sebastian Steinfartz at 
the Technical University 
of Braunschweig 
in Germany and 
his colleagues 


analysed the genomes of more
than 500 Galapagos marine
iguanas (Amblyrhynchus
cristatus; pictured) from
the island of San Cristóbal
in the Galapagos. They
found evidence of ongoing
hybridization between lineages
from different islands, along
with speciation in the San
Cristóbal population.

This simultaneous 
hybridization and speciation 
could have contributed to the 
evolutionary success of the 
marine iguana, the authors say. 
Proc. R. Soc. B 282, 20150425 
(2015) 


Tiny robot fuelled 
by light 


A microscopic ‘walker’ just a 
few tens of micrometres in size 
can shuffle, rotate and even 
jump, powered only by light. 

Hao Zeng and Diederik 
Wiersma at the University 
of Florence in Italy and their 
co-workers created their 
device using materials called 
liquid crystalline elastomers, 
which contract and expand 
like muscles. They added a 
light-sensitive dye, attached 
four cone-shaped legs made 
from acrylic resin and focused 
a laser beam on the robot. The 
device walked in a straight line 
on a patterned surface and even
jumped up to 100 times its own 
body length. 

Such a robot could be 
powered by ambient light 
alone, and could be modified to 
perform other actions such as 
swimming, the authors say. 
Adv. Mater. http://doi.org/f2747b 
(2015) 


ASTRONOMY 


Megaflare seen 
on star surface 


Astronomers have spotted
an enormous surge of light
and magnetic energy on a
nearby star.

A team led by
Wouter Vlemmings
at Chalmers
University of
Technology
near Gothenburg, Sweden,
pointed the ALMA radio
telescope in northern Chile
at the red giant Mira A, a star
92 parsecs (300 light years)
away that was once like our
Sun but is now bloated in old
age. ALMA’s high resolving
power was able to pick out
features on the stellar surface
— a feat unprecedented at
these wavelengths. The data
revealed a bright hotspot on
Mira’s surface that is roughly
the same size as Mercury’s
orbit around the Sun.

The star is probably
unleashing energy from its
magnetic field, similar to what
happens on the Sun, suggesting
that magnetic fields have a role
even when these stars grow old.
Astron. Astrophys. 577, L4 (2015)


Chimps’ mental
capacity to cook


Chimpanzees have key
cognitive abilities for cooking
food — a hint that humans
might have developed the
capacity for cooking early in
evolution.

Felix Warneken at Harvard
University in Cambridge,
Massachusetts, and Alexandra
Rosati at Yale University in
New Haven, Connecticut,
studied the cognition of
chimps (pictured) by
presenting them with a
specially designed cooking
device and raw and cooked
foods such as carrots and
potatoes. They confirmed
that the apes prefer cooked
to raw items, and found that
chimps are willing to wait
longer for cooked food than
for raw food. The animals
were able to give up their own
raw food to cook it, and to
save it for later cooking.

The results suggest that the
last common ancestor of apes
and humans had the cognitive
abilities to cook food, long
before humans learned to
control fire.
Proc. R. Soc. B 282, 20150229
(2015)


SOCIAL SELECTION
Popular topics
on social media


Unpaid research jobs draw criticism


Volunteer jobs are a rite of passage for many budding
ecologists and wildlife biologists, but a website highlighting
these unpaid positions calls them “unprofessional” and
“exploitative”. Alex Bond, a conservation biologist at the
RSPB Centre for Conservation Science in Sandy, UK, created
the Tumblr page “Crap Wildlife ‘Jobs’” on 31 May
(http://crapwildlifevolunteerjobs.tumblr.com), and it already has
supporters on Twitter. “Really cool (and necessary) initiative,”
tweeted Julie Godbout, an environmental geneticist at Laval
University in Quebec City, Canada. “Do what you love
AND get paid for it?” But Stephanie Stack, an environmental
scientist with the Pacific Whale Foundation in Wailuku,
Hawaii, which is featured on the page,
says that unpaid internships give young
scientists a chance to gain valuable
experience and to make connections in
the field.

NATURE.COM
For more on popular papers:
go.nature.com/pfew9n
EVENTS


MERS outbreak 

A large outbreak of Middle East 
respiratory syndrome (MERS) 
coronavirus in South Korea 
has caused alarm this week. 

As Nature went to press, the
hospital-acquired virus had
killed 7 people and infected
94 in the country — the largest
outbreak outside the Middle
East so far. The latest cases
began when a South Korean
man returned to Seoul from
the Middle East, and visited
four health-care facilities before
being diagnosed. There is also
one case in China, imported
from South Korea. The virus
could become a greater threat
if it acquires mutations that
allow it to spread between
humans more easily; however,
the South Korean health
ministry announced on 6 June
that sequencing suggests
the virus is unchanged. See
page 139 for more.


Solar-plane setback 


Solar Impulse, the plane that
aims to be the first solar-powered
craft to circle the
world, landed unexpectedly
in Nagoya, Japan, on 1 June as
it attempted to begin its long
trans-Pacific journey. Weather
patterns forced the re-routing.
On 2 June, wind gusts whipped
through the airport at Nagoya
and damaged part of Solar
Impulse’s wing, delaying the
project for at least another
week. The planned round-the-world
flight began in March
from Abu Dhabi.


NUMBER CRUNCH


$400 million


The largest-ever gift donated
to Harvard University in
Cambridge, Massachusetts,
given on 3 June to its School
of Engineering and Applied
Sciences by alumnus and
hedge-fund manager
John Paulson.


Hopes high as LHC switches on again


Scientists at CERN, Europe’s particle-physics
laboratory near Geneva, Switzerland, celebrated
the official restart of the Large Hadron Collider
(LHC) on 3 June. The machine was shut down
for two years to undergo upgrades and can now
smash protons together at a record energy of 13
teraelectronvolts. Whereas the LHC’s first run
hunted down the Higgs boson, the second will
do a wide-ranging search for discrepancies with
the standard model, physicists’ best description
of particle and force behaviour. See
go.nature.com/sglyfm for more.


Solar sail unfurls
The world’s first privately
funded solar sail unfurled
in orbit on 7 June. The test
flight of LightSail, run by the
Planetary Society, a space-advocacy
group in Pasadena,
California, had run into a
series of communications
and battery problems, but
unexpectedly responded to
mission control after a three-day
silence and deployed its
32-square-metre sail. Solar
sails use the pressure from
light radiated by the Sun to
move through space.


Space partnership
A near-Earth space-weather
observatory has been
nominated to become the first
official joint mission between
the European Space Agency
and the Chinese Academy
of Sciences. The Solar wind
Magnetosphere Ionosphere
Link Explorer (SMILE) would
study the interaction between
Earth’s magnetic field and the
solar wind, charged particles
that stream from the Sun. It is
scheduled for launch in 2021.
Selected on 4 June from a
pool of 13 proposals, SMILE
is expected to cost about
€100 million (US$112 million),
and will undergo further
assessment ahead of a formal
selection process later this year.


Biochemist dies


Irwin Rose, an American
biochemist who co-discovered
a cellular recycling system,
died on 2 June aged 88. Rose
shared the 2004 Nobel Prize in
Chemistry for work showing
that proteins marked with a
molecular tag called ubiquitin
are destined for destruction.
This process malfunctions in
diseases such as cystic fibrosis
and cancer, and Rose’s work led
to the development of a drug to
treat certain blood cancers.


FACILITIES 


Mammoth telescope 
The Giant Magellan Telescope
has been given the go-ahead
to start construction at Las
Campanas Observatory
in Chile, project leaders
announced on 3 June. Its
ultimate design calls for
seven mirrors that together
will span 25 metres. When
it comes online in 2021, it
will have only four of those
seven in place, but it will
still be the world’s largest
optical telescope at that point.
Eleven international partners
have committed more than
US$500 million to the project,
just over half of its total
expected cost.


RESEARCH
Fracking impacts 


Hydraulic-fracturing activities 
in oil and gas development 
have not had major effects 
on drinking-water resources 
in the United States, the US 
Environmental Protection 
Agency reported in a draft 
assessment on 4 June. The 
much-anticipated study did 
document multiple threats 
to surface and groundwater 
resources, including poorly 
constructed wells and the 
improper disposal of waste 
water produced during the 
drilling process. But agency 
officials said that the number 
of cases of documented 
contamination is “relatively 
low”. Accompanied by more 
than 20 peer-reviewed 
reports, the study will be 
finalized after review by the 
agency's Science Advisory 
Board. 


Water-lily discovery 
The discovery of a new species
of water lily was announced
by the Royal Botanic Gardens,
Kew, UK, on 5 June. The
bloom (pictured), which has
yet to be named, was found
in Western Australia by Kew
tropical-plant specialist
Carlos Magdalena and teams
from Kings Park and Botanic
Garden and the University
of Western Australia, both in
Perth. An identical plant had
previously been collected in
Australia’s Northern Territory,
but was thought to be a
hybrid. The discovery of this
plant in the remote creeks of
Kimberley, many hundreds
of kilometres away from the
previous find, led Magdalena
to conclude that it is a new
species. DNA analysis will be
used to confirm the find.


Although big gains have been
made in cutting mortality due to
illness, disabilities from illness
pose a major problem, warns
a study of 188 countries in The
Lancet (T. Vos et al. Lancet
http://doi.org/45v; 2015). It found
that the proportion of years of
healthy life lost rose from about
one-fifth in 1990 to nearly
one-third in 2013. The leading
causes — including low-back pain
and depression — have changed
little, but only 1 in 20 people had
no problems in 2013, and 1 in 3
had more than 5.


POLICY 


G7 climate pledge 


On 8 June, the G7 group 

of leading industrialized 
nations adopted a target of 
reducing global greenhouse- 
gas emissions to 70% of their 
2010 levels by mid-century 
to achieve climate goals. 
Issued at the conclusion of 
the G7 summit at Schloss
Elmau in Krün, Germany, the
communiqué goes beyond
previous statements. Those had
affirmed the goal of limiting the
average temperature increase
to 2°C above pre-industrial
levels, but had not quantified
the actual emissions reductions
required. The G7 also called for
decarbonization of the world
economy — ending fossil-fuel
use — by the end of the twenty-
first century.


Animal experiments


The European Commission
has decided not to propose
changes to legislation on the
use of animals for scientific
purposes, after considering a
petition to overhaul its laws. A
European Citizens’ Initiative
called Stop Vivisection, signed
by more than one million
Europeans and submitted to
the commission in March, had
called for a complete ban on
animal use. The commission
said on 3 June that animals are
still required in biomedical
research, but it pledged to
promote the development of
animal-free ways of testing the
safety and efficacy of drugs
and other chemicals.


GROWING SICKNESS


Although people are living longer, they are also living with more
chronic conditions, as seen here in data for the developed world.


COMING UP
14-17 JUNE

The latest research
on human fertility
and reproductive
technologies is
presented at the
European Society of
Reproduction and
Embryology’s annual
conference in Lisbon.
www.eshre2015.eu


15-19 JUNE
Interdisciplinary
scientists from around
the world meet in
Chicago, Illinois, for the
Astrobiology Science
Conference 2015.
This year’s theme is
Habitability, Habitable
Worlds, and Life.
go.nature.com/hwhc8z


Robot contest 


A South Korean team won 
the US$2-million first prize 
in the US Defense Advanced 
Research Projects Agency’s 
DARPA Robotics Challenge, 
which took place on 5-6 June 
in Pomona, California. 
Twenty-three teams from 
around the world entered 
machines into the contest, in 
which they had to compete at 
performing tasks that could be 
useful in a disaster scenario, 
such as climbing a ladder 
and shutting down a valve. 
The winner was DRC-Hubo, 
a humanoid robot that can 
switch between walking and 
moving faster on wheels.
It was developed at KAIST,
a science and technology
research university in
Daejeon, South Korea.




HOMOSEXUALITY 


A 2014 Ugandan law that punished homosexuality with life imprisonment was later repealed. 


African academics challenge 
homophobic laws 


Scientific report demolishes assertions used to back criminalization of homosexuality. 


BY LINDA NORDLING 


A Western import. Unnatural. Contagious.
All of these arguments and
more have been invoked to support the
numerous laws criminalizing homosexuality in
Africa. But now African academics have used
scientific evidence to argue against such laws
and to urge African nations to abandon them.
In a report published on 10 June by the Academy
of Science of South Africa (ASSAf), the
academics, most of whom are scientists, make 
the case that laws criminalizing homosexuality 
have no basis in science and hamper efforts to 
prevent and treat HIV and other sexually trans- 
mitted infections (see go.nature.com/q3rr4k). 

Human-rights activists and many people 
working in health care welcome the report, 
which comes in response to a slew of homo- 
phobic laws in several African countries, in 
particular Uganda. “It opens up a new outlook 
about homosexuality seen through the lens
of science,” says Thomas Egwang, a Ugandan
immunologist who was not involved in produc- 
ing the report. 

However, activists and health-care workers 
also warn that it will probably not have a big 
impact on policy-making — at least in the 
short term — because homophobic attitudes 
are deeply entrenched. “I’m not sure the [Ugan- 
dan] government is going to listen,” says Kent 
Klindera, programme manager of an initiative 
at the Foundation for AIDS Research in
New York that promotes HIV prevention for
gay men, transgender individuals and men who
have sex with men (MSM).

11 JUNE 2015 | VOL 522 | NATURE | 135

Same-sex relationships are illegal in 38 of 
Africa's 53 nations and punishable by death in 4. 
In recent years, anti-gay sentiment has 
intensified. In February 2014, Uganda, 
where homosexuality was already ille- 
gal, passed a law that made it punish- 
able by life imprisonment and made the 
‘promotion’ of homosexuality a crime. 
In the same year, Gambia enacted a 
similar law, and Nigeria passed a law 
banning same-sex marriage, gay-rights 
groups and displays of same-sex affec- 
tion in public. 

Although the Ugandan law has 
since been repealed (a constitutional 
court declared that there had not been 
enough members present when parlia- 
ment passed the act), it prompted an 
international outcry — and sparked 
the idea for the report among members 
of ASSAf. “It had to be Africa-led,” says
Glenda Gray, co-chair of the report 
panel. “If it were American-led, Afri- 
can governments would say that it was 
Western propaganda.” 

The report’s authors, of whom 11 
work in South Africa, 1 in Uganda and 
1 in the United States, drew on medi- 
cine, anthropology, psychology and philoso- 
phy to counter arguments used to justify the 
criminalization of homosexuality. 

When signing the Ugandan bill into law, 
the country’s president, Yoweri Museveni, 
said that it would stop the “social imperial- 
ism” of the West, which he said is responsible 
for promoting homosexuality in Africa. It is an
argument also used by many of the continent's 
faith-leaders. But the report argues that it was 
missionaries from the West who turned tradi- 
tional African practices, including same-sex 
relationships, into problems. 

The authors reference a review of literature 
dating back to the late nineteenth century 
that documents homosexuality in Africa (see 
go.nature.com/7un5vx), including woman- 
to-woman marriage on the Slave Coast (in 
present-day Togo, Benin and Nigeria), homo- 
sexual relations between shepherd boys in 
Ethiopia, and cross-dressing male prostitutes 
in Senegal. The authors cite a 2008 review that 
calculated the prevalence of MSM in Africa
to be at least 2%. This is in line with a global 
finding that at least 1.5% of men of any 
given population are exclusively attracted to 
members of their own sex. 

The report also counters the notion that
homosexuality is unnatural, citing evidence for
a strong biological role in sexual orientation.
And it tears down the ideas that homosexuality
is ‘socially contagious’, promotes the spread of
HIV or encourages paedophilia, citing papers
that have disproved these, or similar, claims.


President Yoweri Museveni signed Uganda’s homophobic bill into law.

The report warns that criminalizing homo- 
sexuality is the real threat to public health. This 
was illustrated in Uganda when a staff member 
of a US–Ugandan research project at Makerere
University was arrested on charges of recruit- 
ing homosexuals and carrying out ‘unethical 
research’: the US funder suspended the project 
because of fears about staff safety (see Nature 
509, 274-275; 2014). 

The authors hope that the report will fare 
better than an earlier one. Shortly before par- 
liament was to vote on the 2014 bill, the Uganda 
National Academy of Sciences (UNAS) declined 
a government request to produce a report on 
the scientific basis of homosexuality because 
of insufficient time. But a group of scientists 
hastily produced one that was later misquoted
by members of parliament who supported 
the bill. 

The UNAS has endorsed the latest report — 
unlike other African academies invited to do 
so. “The whole question around homosexuality
has become fraught,” says Melissa
Steyn, an author of the report and a 
social scientist who holds the South 
African Research Chair in Critical 
Diversity Studies at the University of 
the Witwatersrand in Johannesburg. 
UNAS president Nelson Sewankambo 
says that it is the role of the academy to 
engage with controversial issues, espe- 
cially where scientific arguments are 
already being used. However, he does 
not expect all of his colleagues to sup- 
port the report's sentiments. 

Juliet Kiguli, an anthropologist at 
Makerere University and the only 
Ugandan on the expert panel, is hope- 
ful that the report will change views in 
her country. “The majority of Ugan- 
dans think this is not part of our cul- 
ture. But culture is not static,” she says. 
If policy-makers take a lead in forging 
a more tolerant society, public opinion 
will follow, she says. 

The report could help human-rights 
activists across Africa to lobby policy- 
makers, says Daniel Onyango, execu- 
tive director of the Nyanza, Rift Valley and 
Western Kenya Network, which campaigns 
for gay rights from Kisumu, Kenya. The 
UNAS’s endorsement of the report might ease 
its acceptance by Kenyans, he adds, given that 
they see Uganda as culturally closer than South 
Africa and that ASSAf published the report. 
Abandoning anti-gay laws would be a major 
step towards offering good-quality health care 
and tailored HIV-prevention information to 
MSM, says Eduard Sanders, an epidemiologist 
at the University of Oxford, UK, who has done 
HIV research on MSM in Kenya. “The road is 
likely to take many years, but there is growing 
momentum in Africa — and I am proud to be 
part of it.”

At least one Ugandan activist has no qualms 
about quoting the report to deter his detrac- 
tors: “As an activist,” says Paul Semugoma, a gay 
Ugandan physician who lives in exile in South 
Africa, and who campaigns for gay rights in 
Uganda, “I plan to get hold of it, and use it akin
to how they use the Bible on us.”


NANOTECHNOLOGY


NEWS IN FOCUS


Injectable brain implant
spies on individual neurons


Electronic mesh has potential to unravel workings of mammalian brain.


BY ELIZABETH GIBNEY


A simple injection is now all it takes
to wire up a brain. A diverse team
of physicists, neuroscientists and
chemists has implanted mouse brains with a 
rolled-up, silky mesh studded with tiny elec- 
tronic devices, and shown that it unfurls to spy 
on and stimulate individual neurons. 

The implant has the potential to unravel the 
workings of the mammalian brain in unprec- 
edented detail. “I think it’s great, a very crea- 
tive new approach to the problem of recording 
from large number of neurons in the brain,” 
says Rafael Yuste, director of the Neuro- 
technology Center at Columbia University in 
New York, who was not involved in the work. 

If eventually shown to be safe, the soft mesh 
might even be used in humans to treat condi- 
tions such as Parkinson's disease, says Charles 
Lieber, a chemist at Harvard University in
Cambridge, Massachusetts, who led the team.
The work was published in Nature Nanotechnology
on 8 June (ref. 1).

Neuroscientists still do not understand how 
the activities of individual brain cells translate 
to higher cognitive powers such as perception 
and emotion. The problem has spurred a hunt 
for technologies that will allow scientists to 
study thousands, or ideally millions, of neu- 
rons at once, but the use of brain implants is 
currently limited by several disadvantages. 
So far, even the best technologies have been 
composed of relatively rigid electronics that act 
like sandpaper on delicate neurons. They also 
struggle to track the same neuron over a long 
period, because individual cells move when an 
animal breathes or its heart beats. 


BUGGING THE BRAIN 


An interdisciplinary team has 
created a rolled mesh of electronics 
that can be injected into a mouse 
skull. Once there, it unfurls and 
melds with the brain tissue.


This soft, conductive polymer mesh can be rolled up and injected into the brains of mice.


The Harvard team solved these problems by 
using a mesh of conductive polymer threads 
with either nanoscale electrodes or transistors 
attached at their intersections. Each strand is as 
soft as silk and as flexible as brain tissue itself. 
Free space makes up 95% of the mesh, allowing 
cells to arrange themselves around it. 

In 2012, the team showed (ref. 2) that living cells
grown in a dish can be coaxed to grow around
these flexible scaffolds and meld with them, 
but this ‘cyborg’ tissue was created outside a 
living body. “The problem is, how do you get 
that into an existing brain?” says Lieber. 

The team’s answer was to tightly roll up a 
2D mesh a few centimetres wide and then use 
a needle just 100 micrometres in diameter to
inject it directly into a target region through a
hole in the top of the skull. The mesh unrolls 
to fill any small cavities and mingles with the 
tissue (see ‘Bugging the brain’). Nanowires that 
poke out can be connected to a computer to 
take recordings and stimulate cells. 

So far, the researchers have implanted 
meshes consisting of 16 electrical elements into 
two brain regions of anaesthetized mice, where 
they were able to both monitor and stimulate 
individual neurons. The mesh integrates tightly 
with the neural cells, says Jia Liu, a member of
the Harvard team, with no signs of an elevated 
immune response after five weeks. Neurons 
“look at this polymer network as friendly, like 
a scaffold”, he says.

The next steps will be to implant larger 
meshes containing hundreds of devices, with 
different kinds of sensors, and to record activ- 
ity in mice that are awake, either by fixing 
their heads in place, or by developing wireless 
technologies that would record from neurons 
as the animals moved freely. The team would 
also like to inject the device into the brains of 
newborn mice, where it would unfold further 
as the brain grew, and to add hairpin-shaped 
nanowire probes to the mesh to record electri- 
cal activity inside and outside cells. 

When Lieber presented the work at a con- 
ference in 2014, it “left a few of us with our 
jaws dropping”, says Yuste.

There is huge potential for techniques
that can study the activity of large numbers 
of neurons for a long period of time with only 
minimal damage, says Jens Schouenborg, 
head of the Neuronano Research Centre at 
Lund University in Sweden, who has developed
a gelatin-based ‘needle’ for delivering
electrodes to the brain (ref. 3). But he remains
sceptical of this technique: “I would like to 
see more evidence of the implant’s long- 
term compatibility with the body,” he says. 


Rigorous testing would be needed before 
such a device could be implanted in people. 
But, says Lieber, it could potentially treat 
brain damage caused by a stroke, as well as 
Parkinson's disease. 

Lieber’s team is not funded by the US 
government's US$4.5-billion Brain Research 
through Advancing Innovative Neurotech- 
nologies (BRAIN) initiative, launched in 2013, 
but the work points to the power of that effort’s 
multidisciplinary approach, says Yuste, who
was an early proponent of the BRAIN initia- 
tive. Bringing physical scientists into neurosci- 
ence, he says, could help to “break through the 
major experimental and theoretical challenges 
that we have to conquer in order to understand 
how the brain works”.


Start-ups fight for a place in
Boston’s biotech hub 


Competition for lab space threatens Kendall Square’s innovative spark. 


BY HEIDI LEDFORD 


As venture capitalist Kevin Bitterman gets
ready to launch his latest start-up, he
knows exactly where he wants it to be:
Kendall Square, a densely populated neighbour- 
hood in Cambridge and the heart of Boston's 
booming biotechnology industry. “You can't
walk two feet there without seeing someone in
biotech,” he says. “That kinetic energy of having
everybody squished together — it leads to a lot 
of advantages you can't get outside the city.” 

But Bitterman, a partner at the firm Polaris 
Partners in Boston, Massachusetts, finds him- 
self contemplating the unthinkable: exile to the 
suburbs. Space in Kendall Square has always 
been tight, but now it is nearly impossible to 
find — particularly for young start-ups, he 
says. Two years ago, when he sought a home for 
another fledgling firm, Bitterman says he could 
count the options in the area on one hand. “But 
at least we had one hand of options to look at,”
he says. “There's literally nothing now.”

The Boston-area biotech community is 
among the largest and densest in the world, 
with Kendall Square at its epicentre. The 
neighbourhood squeezes 120 biomedical firms 
within a 1.5-kilometre radius. The density and 
diversity of the biotech ecosystem make Ken- 
dall unique, says Fiona Murray, associate dean 
for innovation at the nearby Massachusetts 
Institute of Technology (MIT) Sloan School of 
Management. The area is home to biomedical 
firms large and small, but also to the investors, 
patent lawyers, contract research organiza- 
tions and suppliers they need to support them. 
“There’s an extraordinary supply of human 
talent,” says Murray.

Technology and drug firms dominate the Kendall Square neighbourhood of Cambridge, Massachusetts.

But some say that the region may become
a victim of its own success. Since the 1970s,
Kendall Square has evolved from an industrial 
centre dotted with brick factory buildings into 
a hub of high-tech and biotech firms such as 
Biogen and Genzyme. Over the past five years, 
the technology giants Google and Microsoft 
and the multinational drug firms Novartis and 
Pfizer have dramatically expanded their offices 
and laboratories in the neighbourhood, eager to 
build close relationships with hot start-ups and 
the academic powerhouses of Harvard Univer- 
sity and MIT. The influx threatens to squeeze 
out the start-up companies that have helped to 
make Kendall Square what it is. “There are so 
many benefits of being right here,” says Chuck
Wilson, who started the cancer-focused firm 
Unum Therapeutics in Kendall Square last year. 
The firm is growing so quickly that it will almost 
certainly need to leave the neighbourhood
within months to get enough space.

At the same time, more-mature homegrown 
companies such as Alnylam Pharmaceuticals, 
which develops RNA-based therapies, and 
cancer-drug developer ARIAD Pharmaceuti- 
cals are expanding. “Companies that were only 
founded a couple of years ago are now gobbling 
up a lot of space,” says Eric Smith, a partner in 
the Boston office of the commercial-property 
firm Transwestern. 

Young biotechs also compete with high-tech 
firms so desperate for office space in Cam- 
bridge that they are renting buildings designed 
to accommodate labs. Since the end of 2012, lab 
rents in the Kendall Square area have risen by 
13% to a monthly rate of just over US$770 per 
square metre (see ‘Up and away’). Over the same
period, monthly office rents rose by 26% to just 
over $700 per square metre.

Alexandria Real Estate Equities, one of
the largest owners of commercial lab space
in the region, says that 99% of its Cambridge
properties are occupied. In such a competi- 
tive market, most landlords will choose 
established tenants over potentially unstable 
start-ups. Although Cambridge has added 
465,000 square metres of lab space since 
2007, most of that has gone to large firms, 
says Peter Abair, director of economic devel- 
opment and global affairs at the Massachu- 
setts Biotechnology Council in Cambridge. 


SHARED SPACES 
The local community recognizes that start- 
ups need to be nurtured for the biotech hub 
to thrive, says Peter Parker, a co-founder 
of LabCentral in Kendall Square. One of 
several local projects created to provide lab 
space and equipment to help start-ups get off 
the ground quickly, LabCentral receives state 
funding as well as corporate sponsorship 
from large pharmaceutical firms. It plans to 
double its occupancy in the next two years. 
MITIMCo, a division of MIT that manages 
the institution’s sizeable property holdings, 
has also committed to housing start-ups. 
But start-ups may disperse to the suburbs 
anyway, says José Lobo, who studies urban 
economies at Arizona State University in 
Tempe. Kendall Square's story is an anomaly, 
he says — urban centres around the world 
have tried to replicate it, mostly without suc- 
cess. And in places where biotech is thriving, 
such as the San Francisco Bay Area in Cali- 
fornia and the outskirts of Washington DC, 
it is more spread out. Having run out of 
affordable space in Kendall Square, Boston's 
biotech firms may soon follow that pattern, 
he says: “I won't be surprised if they leave.”
Some already have. Wilson says that 
Unum Therapeutics is now close to signing 
a lease on the outskirts of Cambridge. He 
suspects that more will do the same, forming 
new start-up clusters in the suburbs. “We'll
see how it evolves,” he says. “But it'll be a very
different feel.”


UP AND AWAY 


The price of lab space in the Kendall Square 
region of Cambridge, Massachusetts, has risen 
dramatically over the past decade as drug and 
biotech companies vie to make it their home.


A medical worker in South Korea handles a sample from a man suspected of having the MERS virus.


MERS cases spotlight
lack of research 


Outbreak of Middle East respiratory syndrome in South
Korea is controllable, but how it infects humans is a puzzle.


BY DECLAN BUTLER 


The world is watching South Korea as the
latest outbreak of Middle East respiratory
syndrome (MERS) unfolds. But
how exactly the virus jumps to humans in the
first place is still unknown, and clues to that
puzzle lie thousands of kilometres away.

As Nature went to press, the cluster of 
hospital-associated cases in South Korea — 
the largest MERS outbreak outside the Middle 
East — had killed 7 people and infected 95, 
according to the World Health Organization 
(WHO). Hundreds of schools have been shut. 
Although the causal coronavirus, MERS-CoV, 
is considered a potential pandemic threat, spe- 
cialists told Nature that they expect authorities 
to quickly bring this outbreak under control. 

A much bigger challenge than emergency 
response, they say, is how to stop MERS being 
transmitted from animals to people in the 
Middle East, where it is endemic in camels. 
“The focus on South Korea would be better 
directed towards Saudi Arabia,” says David
Heymann, a researcher at the London School 
of Hygiene and Tropical Medicine and chair of 
Public Health England, to stop the cases that
continue to spark new outbreaks at the source.

Since it was first detected in Saudi Arabia 
in 2012, MERS-CoV has infected around 
1,200 people worldwide, roughly 450 of whom 
have died, according to the WHO. The virus 
is thought to originate in bats and to jump to 
humans through an intermediate animal, such 
as camels. It does not easily spread between 
people, partly because it infects deep areas of 
the lungs, and is not coughed out. Most of the 
human infections, however, were the result of 
human-to-human spread, which can occur 
in hospitals when certain medical procedures 
combine with poor infection control to dissemi- 
nate the virus. The latest clusters began when 
a South Korean man returned to Seoul from 
the Middle East, and visited four health-care 
facilities before he was diagnosed. 

There is always a chance that as the virus 
spreads, it could acquire mutations that allow 
it to spread more easily between humans. But 
on 6 June, the South Korean health ministry 
announced that it had sequenced the virus in 
the current outbreak and that it was almost 
identical to past sequences from the Middle 
East. On the same day, the Chinese Center for 
Disease Control and Prevention posted a
separate sequence to the publicly available
GenBank database, from a man infected
in the South Korean outbreak who then travelled
to China, where he fell ill. Christian
Drosten, director of the Institute of Virology
at the University of Bonn Medical Centre in
Germany, has analysed this sequence and
says that it shows only minor mutations
compared with Middle Eastern strains, none 
in areas of the genome thought to influence 
infectiousness. 

A stream of new cases in South Korea 
might create the impression that the disease 
is out of control. But all cases reported so far 
have clear transmission routes from the ini- 
tial infection, says Ian Lipkin, an outbreak 
specialist at Columbia University in New 
York. The country is now intensively tracing 
and isolating the contacts of those infected, 
and implementing strict infection controls 
in hospitals. Were cases springing up outside
of hospitals, that would be cause for worry,
but that is not happening, says Lipkin. 

In the Middle East, however, the virus 
continues to jump from camels to humans,
leading to hospital outbreaks. Heymann,
who in 2003 led the global effort to contain 
severe acute respiratory syndrome, or SARS, 
says that authorities in the Middle East 
should do more to investigate how people 
catch the virus from camels. 

Such studies would involve investigat- 
ing the recent activities of infected people, 
finding out, for instance, whether they had 
had contact with animal carcasses or bodily 
fluids, had consumed fluids such as camel 
milk or urine, or had been near bat colonies.
“It’s frustrating that all cases from animal
infections have not been properly inves- 
tigated,” says Peter Ben Embarek, leader 
of the WHO’s MERS team at the agency’s 
headquarters in Geneva, Switzerland. One 
obstacle is cultural, in that Saudis tend to 
be averse to discussing what they consider 
private matters, he says. 

The outbreak in South Korea will 
probably put pressure on Middle Eastern 
countries to accelerate research and control 
of MERS, says Drosten. 

Another outstanding mystery is why 
human cases have not been detected in 
African countries with large camel popu- 
lations: Somalia has 7 million camels, and 
Kenya 3 million, dwarfing Saudi Arabia's 
population of 260,000. “MERS is circu- 
lating in camels in many parts of Africa,” 
says Ben Embarek, “so camel-wise, it’s the 
same picture as in the Middle East.” One 
possibility is that human cases are going 
undetected because of poor surveillance. 
Another possibility is that cases in Africa 
are less likely or less serious, because MERS 
tends to cause serious illness only in people 
who have diseases that result from modern 
lifestyles, such as diabetes, which are more 
common in Saudi Arabia.


The Yamnaya people are thought to have carried their burial practices and other traditions into Europe.


DNA deluge reveals 
Bronze Age secrets 


Population-scale studies of ancient genomes hint at roots of
technology, languages and diet.


BY EWEN CALLAWAY 


Only half a decade after a 4,000-year-old
tuft of hair yielded the first ancient-human
genome (ref. 1), researchers are starting
to sequence ancient genomes by the dozen,
much as they do with modern genomes.

Such population-scale sequencing is answer- 
ing long-standing questions about the Eurasian 
Bronze Age. This tumultuous period between 
about 3000 BC and 1000 BC saw new technologies
and cultural traditions — from the use
of finely crafted weaponry and horse-drawn 
chariots to changes in burial practices — spread 
across Europe and Asia, starting in the steppe 
between the Black Sea and the Caspian Sea. 

As DNA data flood in, researchers say, the 
mass-genome approach will paint an increas- 
ingly accurate picture of the past and show 
how ancient events shaped modern humanity 
— from what we eat to the diseases that ail us. 
“Christ, what does this mean?” says Greger
Larson, an evolutionary geneticist at the
University of Oxford, UK. “In another five 
years, we'll be talking about tens of thousands 
of ancient genomes.” 

The dawn of ancient population genomics 
is the result of cheap DNA sequencing and 
the rise of boutique lab techniques that can 
separate highly degraded ancient DNA from 
contemporary contaminants. 

A team led by palaeogenomicists Morten 
Allentoft and Eske Willerslev at the Natural His- 
tory Museum of Denmark in Copenhagen has 
used these advances to sequence the genomes 
of 101 people who lived across Eurasia between 
about 3000 BC and AD 700 (ref. 2). “We could
have stopped at 80,” says Allentoft. But “we 
thought, ‘Why the hell not? Let's go above 100.’”

The sequences allowed the team to tackle 
questions that have vexed archaeologists for 
decades, says Allentoft. For example, research- 
ers have disagreed over whether the cultural 
changes of the Bronze Age were the result
of migration or simply the spread of ideas.
Allentoft and his colleagues found evidence for 
migration, in the form of a massive shift in the 
genetic make-up of northern and central Euro- 
peans at the start of the Bronze Age. Before 
3000 BC, their genomes resembled those of
early farmers from the Middle East and even
earlier European hunter-gatherers. By 2000 BC,
their genomes looked more like those of people
from the Yamnaya culture, which arose on the
steppe around 2900 BC.

The findings echo those of a team that 
sequenced 69 ancient Europeans (ref. 3). Both
groups speculate that the Yamnaya migration 
was at least partly responsible for the spread 
of the Indo-European languages into Western 
Europe. 

Allentoft’s team found genetic traces of the 
Yamnaya in people who lived near the Altai 
Mountains in central Russia from 2900 BC
to 2500 BC, potentially explaining why
Indo-European languages are spoken so far into Asia.
“It’s pretty clear that these eastern cultures in
the Bronze Age are linked to the Yamnaya,”
says Pontus Skoglund, a population geneticist 
at Harvard Medical School in Boston, Massa- 
chusetts. But he is not yet convinced that the 
culture’s wanderings explain the origins of all 
Indo-European languages. 

Ancient population genomics also offer 
insights on physical and physiological traits. 


Allentoft’s team found that the ability to digest 
milk into adulthood — nearly universal in 
northern Europeans today — was rare in Bronze 
Age Europeans, contradicting earlier claims that 
the trait helped early European farmers to gain 
calories from milk. Of the 101 sequenced indi- 
viduals, the Yamnaya were most likely to have 
the DNA variation responsible for lactose tol- 
erance, hinting that the steppe migrants might 
have eventually introduced the trait to Europe. 

Another team has analysed (ref. 4) DNA from
83 ancient Europeans and discovered that a 
mutation linked to thick hair and numerous 
sweat glands, once thought to have emerged in 
East Asians, was common in Scandinavians as 
early as 7,700 years ago — potentially revealing a 
connection between these groups. That analysis, 
posted on the preprint server bioRxiv in March, 
also found evidence of evolutionary pressure on 
height: Iberians seem to have become shorter 
after farming arrived in what is now Spain and 
Portugal 8,000 years ago, whereas the Yamnaya 
who migrated out of the steppe appear to have 
been taller than their neighbours. 

In future, researchers are likely to probe 
genomes to see how past events shaped mod- 
ern susceptibility to disease, says Larson. For 
instance, survivors of the fourteenth-century
Black Death, which killed around half of 
Europeans, may have carried gene variants 
that protect against certain infections.

“It’s an interesting time, because the
technology is moving faster than our ability 
to ask questions of it,” says Larson, whose lab 
has also amassed around 4,000 samples from 
ancient dogs and wolves to chart the origins of 
domestic dogs. “Let’s just sequence everything 
and ask questions later.” SEE NEWS & VIEWS P.164


CLARIFICATIONS

It could have been clearer in the News story 
‘Ebola R&D woes spur action’ (Nature 521, 
405-406; 2015) that the US Food and Drug 
Administration was not consulted about or 
involved in the design of the brincidofovir 
study; the agency advocated generally for 
randomized clinical trials on drugs against 
Ebola. In the News Feature ‘CRISPR, the 
disruptor’ (Nature 522, 20-24; 2015), the 
June 2012 entry in the graphic entitled
‘The rise of CRISPR’ was ambiguous. The
researchers had targeted the CRISPR 
system to specific DNA sequences, 
highlighting its potential for genome editing. 


© 2015 Macmillan Publishers Limited. All rights reserved 


THE MILITARY-BIOSCIENCE COMPLEX


THE US DEPARTMENT OF DEFENSE IS MAKING A BIG PUSH INTO
BIOLOGICAL RESEARCH — BUT SOME SCIENTISTS QUESTION WHETHER
ITS HIGH-RISK APPROACH CAN WORK.


BY SARA REARDON 


When Geoffrey Ling talks about the future of technology, his
ideas go flying around the room like a whirlwind. Ling eagerly
describes a world in which people live far beyond their natural
lifespans, minds can be downloaded into external ‘hard drives’
for enhancement by artificial intelligence, and robots and aircraft are
controlled by human thought.

“It's abso-posi-frickin-lutely going to happen,” he declares. “The 
next 20 years are going to make our heads spin, because we've already 
crossed over into that realm.”

Ling should know: he is doing as much as anyone to make these
visions real. A neurologist by training, he is also a US Army colonel and
director of the first biology funding office to operate within the Defense 
Advanced Research Projects Agency (DARPA), the Pentagon’s avant- 
garde research arm. The Biological Technologies Office (BTO), which 
opened in April 2014, aims to support extremely ambitious — some 
say fantastical — technologies ranging from powered exoskeletons for 
soldiers to brain implants that can control mental disorders. 

DARPA’s plan for tackling such projects is being carried out in the same
frenetic style that has defined the agency’s research in other fields. Ever 
since it was created in 1958, a year after the Soviet Union beat the United
States into space by launching the world’s first
artificial satellite, Sputnik, the agency's mission
has been to prevent any more such surprises
by getting there first. So DARPA’s programme
managers at the BTO are free to pour tens of 
millions of dollars into ambitious projects 
without waiting around for niceties such as 
peer review. And by working closely with its contractors as they develop 
their technology, the agency aims to drive discoveries across the often- 
deadly gap between basic research and commercialization. 

That aggressive, high-risk strategy has had spectacular pay- 
offs — most famously with the agency’s development of the Internet 
in the 1970s. And that has happened often enough to inspire imitators 
such as ARPA-E, a branch of the US Department of Energy that is 
devoted to high-risk research into alternative energy sources. 

But some wonder whether DARPA’s full-speed-ahead model will
work as well for biology as it has for the physical sciences and hardware. 
Living systems are much more complex, they argue, with a multitude of 
variables that are either unknown or difficult to engineer and control. 
And because so much of the agency’s biological research is directly 
applicable to humans, the work is fraught with ethical concerns — 
not to mention the possibility that even the most benign-sounding 
developments could be co-opted for war. Synthetic organisms designed 
to produce greener biofuels could also make explosives, for example, 
and brain-stimulation technology intended to heal wounded soldiers 
could also enhance combat abilities. 

Edward Hammond, a biology-policy consultant in Austin, Texas, 
wonders whether the agency often has ulterior motives when it contracts
researchers. “You don’t ever really know what DARPA wants,”
he says. “But they’re pretty good at finding people who are resolving 
questions they’re interested in for other reasons.” 

Still, many biologists are willing to accept money from the 
Department of Defense (DOD), on the grounds that innovations 
such as better prosthetics and improved mental-health treatments 
are needed no matter who is paying for them. And Ling insists that 
DARPA understands the concerns: every programme in the BTO has a
bioethics advisory board. Besides, he says, if visionary biotechnologies 
are inevitable, then it is DARPA’s duty to race ahead and invent them. 

“Some people think it’s scary,” he says, contemplating that future. 
“But I think it’s rather exhilarating.”


A prosthetic arm 
developed at Johns 
Hopkins University and 
funded by DARPA can 
be controlled by the 
wearer’s own nerves. 


TIME TO COMBINE 

DARPA’s embrace of bioscience began in earnest in 2001, when anthrax
spores posted to media offices and members of the US Congress 
brought concerns about bioterrorism to the fore. Then came the wars 
in Afghanistan and Iraq, which led the agency to invest in fields such 
as neuroscience, psychology and brain-computer interfaces — all 
with the intention of helping injured veterans. By 2013, the number of 
biology-related programmes had grown such that DARPA decided to 
consolidate them under one roof. The natural choice to head the new 
office and its US$288-million annual budget was Ling, who was deputy 
director of DARPA’s science division at the time.

The office will certainly speed up research, says George Dyson, an 
independent science historian in Bellingham, Washington, and not 
least because of the military’s culture of completing missions quickly, 
without lengthy reflection or debate. Looking at what DARPA has 
already done in fields such as computing, says Dyson, “it’s always the 
military who move quickly enough to fund the interesting things”. 

A good example is DARPA’s reaction to US President Barack Obama's
2013 announcement of the BRAIN initiative: a high-profile, multi-agency
effort to understand the circuitry of the brain. The National Institutes
of Health (NIH) spent months designing a ten-year strategic plan
for the initiative before distributing its share of the money, and the
National Science Foundation (NSF) opened a competition for its spending
share to any research project related to brain networks. But DARPA
quickly funnelled more than $50 million into just a few five-year
programmes.

NATURE.COM
To hear a podcast on DARPA’s biology push, see:
go.nature.com/cotlqx

These efforts now fall under the remit of the BTO. One, called 
Restoring Active Memory, is attempting to create a stimulation device 
that restores soldiers’ ability to form memories after brain damage. 
Another, called SUBNETS (System-Based Neurotechnology for 
Emerging Therapies), is developing a brain implant that can treat seven 
mental and neurological disorders. As a first step, both projects are 
monitoring the brain activity of people with epilepsy who have had 
temporary electrodes implanted to locate the origin of their seizures. 
The investigators ask these patients to carry out memory exercises, or 
to perform tasks that involve neural pathways that might be impaired 
in addiction or depression, and record the electrical patterns that result. 

The pay-offs could still be some way off, however. “There’s no 
question this is a very ambitious goal,” says Edward Chang, a neuro- 
surgeon at the University of California, San Francisco, who co-leads 
one of the SUBNETS teams. “I don’t think anyone is naive enough to 
think they'll be easily solved in the next five years.” 

As ambitious as DARPA is, however, its funding process can be 
unsettling for researchers who are accustomed to elaborately peer- 
reviewed grants from civilian agencies. At DARPA, much of the author- 
ity is vested in the programme managers, who rotate in and out from 
academia, industry and the armed services. They alone design the ini- 
tiatives, invite researchers to apply for contracts with specific goals and 
milestones and select the groups they think are most likely to achieve 
the goals. Then they work closely with the researchers to guide the 
project as it proceeds. 

DARPA calls its grant recipients ‘performers’ — and if they do not 
meet their milestones, the axe can fall quickly. In 2007, for example, 
DARPA started a programme called RealNose: an effort to develop a 
synthetic dog nose with real olfactory receptors for detecting odorants 
such as chemical weapons. But the agency killed the programme three 
years later, after it became clear that the receptor proteins were too 
unstable at room temperature. 




Researchers who follow DARPA’s choreography are almost always
free to publish their results, says BTO deputy director Alicia Jackson: 
very few of the agency’s projects are classified as secret. But DARPA 
grant recipients do give up a certain amount of freedom: if they come 
across an interesting scientific question as they work, for example, 
they cannot use DARPA funding to pursue it. “Initially it was a change 
in culture,” says Emad Eskandar, a neurosurgeon at Massachusetts
General Hospital in Boston and director of one of the SUBNETS programmes.
But Eskandar and his partner, psychiatrist Darin Dougherty,
maintain that DARPA’s oversight has made the project better. “It’s
helped us to focus and move ahead,” Dougherty says.

Certainly, Ling is determined to prove that DARPA’s model can work 
as well for biologists as for military contractors. One of his favourite 
successes is a prosthetic arm that DARPA developed in collaboration 
with the biotechnology firm DEKA in Manchester, New Hampshire. 
The device works by picking up the electrical signals that travel from 
the brain’s motor cortex into nerves in the stump, then translating those 
signals into the appropriate motions of the attached prosthetic hand. 
This allows wearers to perform difficult tasks such as handling soft 
fruit and even rock climbing. The device won approval from the US 
Food and Drug Administration last year — the first nerve-controlled 
prosthetic to do so — and the company says that it is now working on 
commercialization. Similar arms are being developed for DARPA at
the Johns Hopkins University in Baltimore, Maryland, and elsewhere;
all of them are also being tested in people with paralysis in the hope
that brain implants can translate their intentions into electrical signals
that drive the hand.


An exoskeleton designed through DARPA’s Warrior Web programme
enhances soldiers’ physical abilities.


The BTO has also taken over DARPA’s health programmes, including
one that is seeking to turn bacteria that prey on other bacteria into
therapeutic antimicrobial agents. Other programmes have more obvious
military applications, such as an exoskeleton that boosts a soldier’s
strength and speed. A programme called Narrative Networks studies
how the brain reacts to different stories and arguments, which could be
helpful for planning how to convince a disaster-stricken village to accept
US military aid, or to turn terrorists away from their agendas. And several
synthetic-biology initiatives are making biological systems that
can be programmed to produce any compound a user wants, including
some that do not exist in nature. These could include materials for making
lightweight body armour, coatings for strengthening equipment,
tissues that can be used to repair wounds, and more-efficient biofuels.

Ling and his DARPA colleagues revel in such ideas — the more far
out, the better. “We look for ways to say yes, not no,” he says.

For all its breakthrough successes, however, there is little evidence that
DARPA’s fast-track approach is consistently any better than peer review at
choosing winners. “They've been successful when they've been successful,”
says Jonathan Moreno, a bioethicist at the University of Pennsylvania
in Philadelphia. A DARPA spokesperson says that the agency cannot
determine how often goals are met or contracts are cancelled. One reason
is that the goalposts keep moving: if a project starts to seem unfeasible,
programme managers often change the criteria for success and salvage
what they can rather than cancelling the contract. Another is that, unlike
civilian agencies such as the NSF and the NIH, DARPA does not make
public the grants that it makes. Nor does it conduct internal analyses that
could determine whether its programme managers are choosing the best
teams and paying for the best science that they possibly could.

“To me, that’s a big problem,” says Pierre Azoulay, an economist
at Massachusetts Institute of Technology in Cambridge. “Pointing to
great successes is not enough,” he says. The agency’s idea of programme
evaluation is “very much in the mode of, ‘Look, the Internet!’”.

But Jackson literally laughs at the idea that the BTO should be more
introspective. The office’s budget is 1% of the size of the NIH’s, she says,
with little margin for overhead costs. And besides, she says, “we go with
whoever can get the job done” — never mind factors such as experience
or lab size. “I think we have a really good track record in our 50-plus-year
history,” she says. Listing DARPA’s successes, she starts with the Internet.

But if DARPA is not slowing down to evaluate its successes, is it
evaluating the effect they could have on society? Ling says yes, pointing
to the ethicists who provide ongoing guidance on the implications of
the BTO’s work. That is far beyond the level of scrutiny given to most
NIH- and NSF-funded projects, notes James Giordano, a neuroethicist
at Georgetown University in Washington DC, and an adviser on
SUBNETS. Usually, these undergo ethics evaluation only at their 
beginning or end. Moreno agrees. “The irony is that people think the 
national-security world is so far behind the civilian world on these 
things,” he says. “But again and again, DOD has been ahead.” 

Nevertheless, some researchers continue to be sceptical. At Oregon 
Health & Science University in Portland, emeritus neuroscientist Curtis 
Bell worries that technologies such as brain stimulation could be used 
to subdue people, in a similar way to the prefrontal lobotomies that 
were used in the mid-twentieth century to deal with some troublesome 
prisoners. “You could imagine such things being more sophisticated 
nowadays,” he says. “You wouldn't need to damage all the frontal lobes
if you could go to a specific nucleus and alter someone’s personality.” 

Dyson points out that there is no guarantee that the Pentagon will 
actually listen to ethicists’ concerns — or to DARPA’s. “Some of these
technologies are absolutely fascinating and intriguing and hold all this 
promise for good, but they’re very close to being weaponized easily,” he 
says. And, says Moreno, although many people in the military think 
deeply about the implications of new technologies, the worry is that the 
political authorities above them may not allow them much freedom to 
slow down or change direction. 

Of particular concern are the BTO’s synthetic-biology programmes. 
The Pentagon has talked about engineering bacteria to clean up sites 
contaminated by radiation or chemical weapons, stoking fears that
these organisms could get out of control when released into the
environment. Although there is no reason to think that the United 
States is creating synthetic biological weapons, some fear even the inti- 
mation that microbes are strategically useful. “It’s sending a signal that 
there’s a role for synthetic-biology products for use in the field,” says
Hammond. “I would be concerned about that, and I’m concerned that 
DARPA doesn't seem to be.” 

But other researchers are more supportive of the BTO. Ultimately, 
says Giordano, it may not matter who funds the research and who 
accepts the funding, because anyone can use published research for 
their own ends. “Individuals who look at DOD funding as Darth Vader 
science don't recognize that any science can be channelled through 
Darth Vader channels.” 

That is exactly why Ling feels that DARPA needs to jump into 
controversial science without hesitation: if the United States does not 
do it, someone else will. “The only thing we can do is do the work,”
he says, “but do it in a way where we're thinking about the untold 
consequences and how to mitigate them.” 

Ling says that he plans to keep expanding his office over the next 
year — how far depends on funding — to anticipate surprises com- 
ing from any sector. The BTO currently has 11 programme managers 
specializing in fields from infectious disease to natural ecosystems, and 
is looking to expand its repertoire to even more-far-flung fields such 
as palaeontology and astronomy. An expert in exoplanets, Ling says, 
could develop projects in preparation for the possibility of threats from 
outer space as well as the more likely scenario that signs of life will be 
discovered on another planet. “That is without a doubt going to be 
the most exciting scientific news in the history of mankind,” he says. 
“And I’d love for it to be funded by DARPA.”


Sara Reardon is a reporter for Nature in Washington DC.


Super vegetables


Long overlooked in parts of Africa, indigenous greens are now capturing 
attention for their nutritional and environmental benefits. 


One lunchtime in early March, tables
at Nairobi’s K’Osewe restaurant are
packed. The waiting staff run back and
forth from the kitchen, bringing out steaming 
plates of deep-green African nightshade, vibrant 
amaranth stew and the sautéed leaves of cow- 
peas. The restaurant is known as the best place 
to come for a helping of Kenya’s traditional leafy 
green vegetables, which are increasingly show- 
ing up on menus across the city. 

Just a few years ago, many of those plates 
would have been filled with staples such as 
collard greens or kale — which were intro- 
duced to Africa from Europe a little over a 
century ago. In Nairobi, indigenous vegetables 
were once sold almost exclusively at hard-to- 
find specialized markets; and although these 
plants have been favoured by some rural pop- 
ulations in Africa, they were largely ignored 
by seed companies and researchers, so they 
lagged behind commercial crops in terms of
productivity and sometimes quality.

Now, indigenous vegetables are in vogue. 
They fill shelves at large supermarkets even 
in Nairobi, and seed companies are breeding 
more of the traditional varieties every year. 
Kenyan farmers increased the area planted 
with such greens by 25% between 2011 and 
2013. As people throughout East Africa have 
recognized the vegetables’ benefits, demand 
for the crops has boomed. 

This is welcome news for agricultural 
researchers and nutritional experts, who argue 
that indigenous vegetables have a host of desir- 
able traits: many of them are richer in protein, 
vitamins, iron and other nutrients than pop- 
ular non-native crops such as kale, and they 
are better able to endure droughts and pests. 
This makes the traditional varieties a potent 
weapon against dietary deficiencies. “In Africa,
malnutrition is such a problem. We want to see
indigenous vegetables play a role,” says Mary 
Abukutsa-Onyango, a horticultural researcher 
at Jomo Kenyatta University of Agriculture 
and Technology in Juja, Kenya, who is a major 
proponent of the crops. 

Scientists in Africa and elsewhere are now 
ramping up studies of indigenous vegetables 
to tap their health benefits and improve them 
through breeding experiments. The hope is 
that such efforts can make traditional varieties 
even more popular with farmers and consum- 
ers. But that carries its own risk: as indigenous 
vegetables become more widespread, research- 
ers seeking faster-growing crops may inadvert- 
ently breed out disease resistance or some of 
the other beneficial 
traits that made these 
plants so desirable in 
the first place. 

Women sell African
nightshade and other
green vegetables at a
market in Nairobi.


“It is important
that when we promote a specific crop, that we
try to come up with different varieties,” says
Andreas Ebert, gene-bank manager at the 
World Vegetable Center (AVRDC), an agricul- 
tural-research organization based in Shanhua, 
Taiwan. If the increasing popularity of these 
vegetables limits choices, he says, “the major 
benefits we are currently seeing will be lost”.

PROTEIN FROM PLANTS

For Abukutsa, indigenous vegetables bring 
back memories of her childhood. Cow’s milk, 
eggs and some fish made her ill, so doctors 
advised her to avoid all animal protein. Instead, 
the women in her family made tasty dishes out 
of the green vegetables that grew like weeds 
around her house. Her mother often cooked 
the teardrop-shaped leaves of African night- 
shade (Solanum scabrum), as well as dishes of 
slimy jute mallow (Corchorus olitorius) and 
the greens of cowpeas, known elsewhere as 
black-eyed peas (Vigna unguiculata). One 
grandmother always cooked pumpkin leaves 
(Cucurbita moschata) with peanut or sesame 
paste. Abukutsa relished them all and ate the 
greens with ugali, a polenta-like dish common 
in East Africa. 

She chose to pursue a career in agriculture 
because she wanted to “unravel the potential 
hidden in African indigenous vegetables”, she
says. Now, she is considered a leader across
Africa, and increasingly around the world, in a
robust, rapidly growing field. “She’s almost like
the mother of indigenous vegetables in Kenya,”
says Jane Ambuko, head of horticulture at the 
University of Nairobi. 

Abukutsa started out in the early 1990s, 
surveying and collecting Kenya's indigenous 
plants to investigate the viability of the seeds 
that farmers were using. In the decades since, 
she has come to focus mainly on the vegetables’ 
nutritional properties. 

Today, she is far from alone. The AVRDC 
has a dedicated research and breeding pro- 
gramme at its office in Arusha, Tanzania, and 
the Kenya Agricultural and Livestock Research 
Organization in Nairobi does similar work. 
Other health and agriculture organizations 
in both East and West Africa focus on boost- 
ing consumer use and improving the viability 
and yield of these crops. That fits into a global 
trend emphasizing bioregional foods — using 
crops that are well adapted for a given climate 
and environment, rather than foreign plants 
that tend to be less nutritious and require extra 
water or fertilizers. 

Most of the indigenous vegetables being 
studied in East Africa are leafy greens, almost 
all deep green in colour and often fairly bitter. 
Kenyans especially love African nightshade 
and amaranth leaves (Amaranthus sp.). Spider 
plant (Cleome gynandra), one of Abukutsa’s 
favourites for its sour taste, grows wild in East 
Africa as well as South Asia. Jute mallow has a 
texture that people love or hate. It turns slimy 
when cooked — much like okra. Ebert says
that moringa (Moringa oleifera) is not only one
of the most healthful of the indigenous veg- 
etables — both nutritionally and medicinally 
— but it is also common in many countries 
around the world. 

Research by Abukutsa and others shows 
that amaranth greens, spider plant and African 
nightshade pack substantial amounts of pro- 
tein and iron — in many cases, more than kale 
and cabbage (ref. 1). These vegetables are generally
rich in calcium and folate as well as vitamins
A, C and E (ref. 2).


Researchers 
risk eliminating 
the traits that 
make indigenous 
vegetables so 
desirable in the 
first place. 


In recent years, Abukutsa has been studying 
how to maximize nutritional benefits using dif- 
ferent cooking methods. Compared with raw 
vegetables, boiled and fried greens contain 
much more usable iron (ref. 3) and could help to combat
the high rates of anaemia in parts of East
Africa. They can also be important sources of 
protein, she says. “Some people just live on veg- 
etables, and they cannot maybe afford meat.” 

Abukutsa is currently studying the anti- 
oxidant activity of indigenous vegetables, as 
well as how resilient they are to the effects of 
climate change. Most of the traditional vari- 
eties are ready for harvest much faster than 
non-native crops, so they could be promis- 
ing options if the rainy seasons become more 
erratic — one of the predicted outcomes of 
global warming. Slenderleaf (Crotalaria sp.)
is particularly hardy during drought because 
it quickly establishes its taproot. “If we have 
a short rain because of climate change, it can 
survive,” she says. She is working with other 
research partners to select vegetables with 
increased tolerance for variations in rainfall 
and temperature. 

Early on, Abukutsa recognized that she 
needed to do more to convince people to add 
indigenous vegetables to their diets. Since 
around 2000, she has led public education 
campaigns and worked with restaurants and 
supermarkets around Kenya to find out what
they would need to start

Unlike larger leafy vegetables such as kale,
many of the indigenous varieties have small 
leaves that must be separated from their stems 
individually before cooking — a laborious 
process. Recipes are often vegetable-specific; 
spider plant can be cooked with sour milk, for 
example, but cowpea leaves go better with soya 
bean or peanut paste. Although older genera- 
tions and some rural populations know what 
to do with nearly any local vegetable, much of 
the region’s traditional cooking knowledge has 
been lost. So Abukutsa got to work on collect- 
ing and testing recipes to maximize the amount 
of iron and other nutrients the dishes contain. 
K’Osewe was one of the first restaurants to take
an active interest, and others soon followed. 
For Abukutsa, indigenous vegetables are not 
just a research subject — they remain a central 
part of her own diet. “Today at lunchtime, I 
ate pumpkin leaves and nightshade,” she says. 
The vegetables’ new-found popularity 
is spreading throughout East Africa. At a 
bustling market in Arusha, a young woman 
wearing a light-blue headscarf shops for 
sweet-potato leaves (Ipomoea batatas), known 
locally as matembele, which have a reputation 
for improving the blood. She buys them from 
an elderly woman who sells almost exclu- 
sively indigenous vegetables under a large red 
umbrella that protects her stock from the after- 
noon sun. She says her sales of such plants have 
climbed substantially over the past five years. 


GLOBAL APPEAL 

Green vegetables are not the only indigenous 
crops attracting researchers’ attention. In the 
1990s, the US National Research Council 
(NRC) in Washington DC convened a panel 
to examine the potential of Africa’s ‘lost crops’, 
including grains, fruits and vegetables. Chaired 
by renowned agricultural researcher Norman 
Borlaug, the panel concluded that native plants 
held tremendous potential for improving food 
security and nutritional intake across Africa, 
and should be a greater focus for researchers. 
Today, the World Agroforestry Centre in Nai- 
robi is studying a range of Africa's more than 
3,000 indigenous fruit species, and finding that 
they are generally more nutritious, drought- 
tolerant and pest- and disease-resistant than 
their exotic counterparts. 

But vegetables have gained the most notice, 
both in the marketplace and among research- 
ers. Raymond Vodouhe, a plant breeder and 
geneticist with Bioversity International in 
Cotonou, Benin, says that his team’s work in 
West Africa has focused on domesticating wild 
vegetables. The hardy wild plants help African 
families to get through periods of drought or 
crop failure, but are threatened by deforestation 
and other types of land-clearing. By domesti- 
cating them, researchers can give farmers 
more-reliable access to indigenous vegetables 
so that they can better endure lean times. 

The AVRDC is doing active research on 
native species in Asia and Oceania, as well 
as Africa. “A rich diversity of indigenous 
vegetable species exists throughout these 
areas,” says Ebert, pointing to okra and African 
eggplant (Solanum aethiopicum) in Mali, bit- 
ter gourd (Momordica charantia) and Malabar 
spinach (Basella alba) in India, and slippery 
cabbage (Abelmoschus manihot) in the Pacific 
Islands. “The challenge we face is selecting 
which indigenous vegetable species to study 
— with more than 2,000 plants that can be 
considered and consumed as vegetables, and 
very limited research funds, it’s a tough choice.” 
Less than 10% of the AVRDC’s roughly US$20- 
million annual budget goes to studying indig- 
enous vegetables, he says. 

A main focus has been basic problems such 
as difficulties with germination and a lack of 
information about how best to store seeds. 
Indigenous vegetables are not up to modern 
farming standards for characteristics such as 
uniformity of seeds and yield, so there is a lot 
of catching up to do. 

But efforts to improve indigenous vegetables 
could come at a cost, say researchers. If breed- 
ers focus only on increasing yields, they could 
accidentally eliminate nutritional benefits. 
And if farmers seek to drive up production 
through monocrop agriculture — planting 
just one crop — they risk losing some of the 
qualities that make these vegetables such a 
draw. Plots with single crops, for example, face 
higher risks of being completely wiped out by 
insects or diseases. 

At the AVRDC’s office in Arusha in late 
February, vegetable breeder Fekadu Dinssa 
walks through a screened enclosure filled with 
plants used for breeding. He surveys a table 
covered with starter trays of little amaranth 
plants from 57 breeding lines. In one tray, the 
plants are twice as tall as their neighbours, but 
their pale colour will not be popular in the 
market, he says. Dinssa wants to breed the 
fast-growing trait into other lines to develop a 
new type of commercially viable amaranth. It 
is a trial-and-error process that can take years. 

Nightshade and other indigenous vegetables helped to sustain Mary Abukutsa-Onyango when she was a child. She went on to pioneer research in these crops. 


STRENGTH IN DIVERSITY 

As indigenous vegetables are planted in greater 
numbers, it will be a challenge to prevent less- 
common varieties from disappearing, say 
researchers. That could threaten the crops’ 
resilience, because different varieties can carry 
separate genes for resistance to pathogens and 
pests. Loss of diversity could also limit the veg- 
etables’ appeal. In Kenya, for example, coastal 
communities tend to like giant African night- 
shade, whereas western communities prefer 
a variety with smaller leaves that has a much 
more bitter taste. 

Some narrowing of choices has already 
happened. Simlaw Seeds in Nairobi, a divi- 
sion of Kenya Seed Company, sells only a cou- 
ple of varieties each of amaranth and African 
nightshade, chosen because they are the most 
popular at the national level. “Of course it’s 
a concern, because practically speaking, we 
can’t promote them all,” says Abukutsa. She 
and other researchers compromise by pro- 
moting certain types while trying to preserve 
the full diversity in gene banks in Kenya and 
at the AVRDC. The researchers also encourage 
communities to continue growing the varieties 
they have traditionally favoured. 

Calestous Juma, director of the Science, 
Technology, and Globalization Project at Har- 
vard University in Cambridge, Massachusetts, 
sees these efforts as crucial. And with advances 
in genomics, he says, researchers should seek 
ways to improve indigenous crops — by length- 
ening their shelf life, for example — and to use 
them in breeding other plants. “They may have 
traits that may be useful for other crops.” 

Juma, who served on the NRC’s lost-crops 
panel, urges more agricultural research centres 
in Africa to study these vegetables. The work 
that Abukutsa and her colleagues are doing, 
he says, “should be done at every university”. 

On a hot Wednesday morning in March, 
Abukutsa walks around the university campus 
to survey some of her students’ work. One has 
spread amaranth leaves in a wooden box in the 
sunlight to test how drying will alter the plants’ 
nutritional profile. Abukutsa stops to talk to 
another student standing amid dozens of rows 
of recently sprouted African nightshade plants, 
part of an experiment on their genetic diversity. 
“We’ve come so far,” Abukutsa says, “but 
there’s still so much to be done.” 


Rachel Cernansky is a freelance writer in 
Denver, Colorado. 

Pull together for fusion 


ITER director-general Bernard Bigot explains how he will strengthen leadership and 
management to refocus the project’s aim of harnessing nuclear fusion. 


Ten years ago this month, China, 
the European Union, Japan, South 
Korea, Russia and the United States 
agreed on the location for the world’s largest 
nuclear-fusion experiment: ITER, the 
International Thermonuclear Experimental 
Reactor, which they had decided to 
build jointly. India joined six months later. 
The project’s aim is to fuse two isotopes of 
hydrogen (tritium and deuterium) to 
deliver a powerful, clean source of electricity. 
This requires the containment of plasma 
at temperatures ten times higher than the 
Sun’s core. 

Roughly €4-billion (US$4.4-billion) worth 
of construction contracts and €3 billion in 
manufacturing contracts worldwide are 
under way. The first large components are 
being delivered to the site at St-Paul-lez-Durance 
in southern France for assembly. 

The project has been plagued by delays 
and difficulties. The seven ITER members 
are designing and manufacturing key com- 
ponents. When deadlines or standards are 
not met, the knock-on effects across the 
whole project can be dire. Late contracts 
for tools have kept one of the largest build- 
ings — in which ring-shaped magnets up to 
24 metres in diameter will be manufactured 
— inactive since its completion in Decem- 
ber 2011. When problems arise, bickering 
ensues as to who should foot the bill. 

I have been a privileged observer from 
the start, as the high representative for 
ITER in the host country, France. Because 
France itself is not a formal member of 
ITER (it contributes to the European 
Union budget for the project and to some 
basic site infrastructures), I, like many 
others, could only witness with frustration 
the slipping of the schedule despite the best 
efforts of the more than 2,000 dedicated 
people working on ITER. 

Since becoming director-general of the 
ITER Organization, which manages the 
project, in March, I have realized that ITER’s 
main problem has been the lack of a clearly 
defined authority to oversee the entire pro- 
ject. Having someone firmly in the driving 
seat, with the power to take decisions, is the 
key to success in any project. I have learned 
this over the course of my career — through 
building an innovative higher-education 
institute from scratch (École Normale 
Supérieure de Lyon) and as head of the 
French Alternative Energies and Atomic 
Energy Commission for 12 years. 

ILLUSTRATION BY DAVID PARKINS 

Here, I set out my vision for ITER. The 
project must overcome its organizational 
problems so that it can deliver on its promise 
of taking a firm step towards harnessing an 
unlimited, continuous, safe and clean source 
of energy. These lessons apply to any major 
international collaboration. 


A ROCKY TRANSITION 

Since construction began on ITER five years 
ago, it has become increasingly apparent that 
the project's management structure is poorly 
adapted to the challenge of building a large, 
complex research facility. 

Take the 8,000-tonne ITER vacuum vessel, 
the doughnut-shaped central component of 
the ‘tokamak’ reactor that houses the fusion 
reactions. Seven of its nine sectors are to be 
manufactured in Europe and two in South 
Korea, with each region or country taking 
responsibility for how they are sourced. Hav- 
ing two contractors is a risk, because each has 
its own manufacturing techniques; duplicat- 
ing the processes that validate the quality and 
function of components, such as fabricating 
mock-ups, adds to the cost; and the tolerance 
margins that each contractor has adopted 
differ. Yet the ITER Organization is respon- 
sible for assembling the final vessel. 

Any modification has a cascading impact 
on other components. This has generated an 
almost endless to-and-fro between the ITER 
Organization, procuring member countries 
and suppliers. This situation has already cost 
ITER tens of millions of euros. 

People know there is a problem. A 2013 
management-assessment report described 
the decision-making process at the ITER 
Organization as “ill-defined and poorly 
implemented”. The management structure 
has proved incapable of solving issues and 
responding to the project’s needs, so accumulating 
technical difficulties have led to 
stalemates, misunderstandings and tension 
between staff around the world. These 
problems stem from how the organization 
was set up through an international treaty in 
2007 (see ‘The Promethean dream’). 

“Is it possible to realize the Promethean dream of bringing the fire of the Sun down to Earth?” 


First, deputy director-generals from each 
member country or region were given respon- 
sibility for one large technical or administra- 
tive department of the ITER Organization. 
These managers also acted as official repre- 
sentatives for their nation or nations. 

Second, the procurement of components, 
systems and buildings is split among the 
member states so that each could gain expe- 
rience. The work is assigned according to the 
industrial capacities of members and a cost- 
sharing scheme that allocates 45.5% to the 
European Union (as the host) and just over 
9% to each of the others. Each member has a 
procurement centre, called a domestic agency, 
that is legally and administratively independent 
from the central ITER Organization. 


NUCLEAR FUSION 

The Promethean dream 

ITER has been political from the start. 
At a meeting in Geneva, Switzerland, 
in November 1985, then US President 
Ronald Reagan and leader of the Soviet 
Union Mikhail Gorbachev proposed an 
international effort to develop fusion 
energy “as an inexhaustible source of 
energy for the benefit of mankind”. Easing 
geopolitical tensions at the height of the 
cold war was one of their motives. ITER 
engaged political leaders in a common 
venture for the good of all. It gave scientists 
and engineers around the world an 
opportunity to acquire knowledge and 
expertise to lead fusion from research to 
the commercial phase. 

Fusion energy, produced by the 
melding of the nuclei of light atoms into 
heavier ones, powers the Sun and 
stars. Since discovering this in the 1920s, 
scientists have hoped to recreate fusion 
reactions and reap the energy produced 
to generate electricity. ITER’s completion 
will answer the question that has obsessed 
three generations of physicists and 
engineers: is it technologically possible to 
realize the Promethean dream of bringing 
the fire of the Sun down to Earth? 

Because of the turbulence that arises in 
a confined, magnetized plasma, a fusion 
machine aiming for a significant energy 
gain must be large. The ring-shaped ITER 
reactor will be about 29 metres high and 
29 metres in diameter, housed in a building 
that will be comparable in size to the Arc de 
Triomphe in Paris. It requires huge human 
and financial investment. No nation has the 
resources to go it alone. 

Twenty years on, ITER has seven partners 
(China, the European Union, India, 
Japan, South Korea, Russia and the United 
States) and is managed by the ITER 
Organization, which was established on 
24 October 2007 by an intergovernmental 
treaty. The design is settled: ITER will be a 
‘tokamak’ (from the Russian acronym for 
‘toroidal chamber with magnetic coils’). 
Construction is under way in France. 

The organization is responsible for 
validating the design of the facility; compli- 
ance with safety regulations; coordination 
of manufacturing and quality control of the 
numerous components; their on-site assem- 
bly; and later, the operation of the facility. 

Paperwork abounds. For each work pack- 
age, the organization signs a procurement 
arrangement with the relevant domestic 
agency that details all technical specifica- 
tions and management requirements. The 
domestic agency then launches a call for 
tender to select a company or consortium 
to do the work. 

Such a system has benefits: procurements 
are shared widely, industries in member 
states develop, spin-offs are generated, jobs 
are created and specialists trained. Intel- 
lectual property generated by the project is 
shared. But it has become ever more obvious 
— as successive reports have pointed out — 
that the costs outweigh the benefits. 


TEAM BUILDING 

I accepted the job of director-general on 
the condition that the position was newly 
invested with full authority over the whole 
project. Authority and a radical redefi- 
nition of how the organization interacts 
with the domestic agencies are at the core 
of the action plan that I submitted to the 
ITER Council in January, before my formal 
appointment. 

The domestic agencies will retain their 
distinct legal identity. But they will be inte- 
grated functionally and put on an equal foot- 
ing with the departments in what we now 
call the ITER Organization Central Team, 
based in St-Paul-lez-Durance. 

A new executive project board brings 
together the managers of the central team and 
the domestic agencies at least once a month, 
in person or by video conference. Disputes 
can be settled and decisions taken swiftly. 

Technical issues — from construction to 
radioprotection and cryogenics — are han- 
dled by project teams of 20 to 50 people, 
depending on the scope. They comprise 
staff from the central team and domestic 
agencies on the basis of technical need, 
professional skills and experience. When 
necessary, representatives of contracting 
industries participate. 

It is too late and costly to reverse decisions 
that have already been made — such as how 
the tokamak vacuum vessel is fabricated. 
Problems must be solved downstream; in 
April, the executive project board formed 
a joint ITER Organization and domestic 
agency project team to anticipate and over- 
come integration and assembly issues. Had 
this decision been taken earlier it would have 
saved time, money and frustration. 


ITER ORGANIZATION 


The ITER Organization and domestic 
agencies together employ 2,000 people. 
Changing how ITER is managed will alter 
its culture. I aim to foster an atmosphere in 
which each party or individual feels per- 
sonally responsible for the whole project, 
not just their area of competence. One of 
my first actions after becoming director 
was to address the staff of each domestic 
agency. The most striking moment was in 
a video session with all four Asian agen- 
cies. For the first time, colleagues in Japan, 
India, South Korea and China saw the 
faces of their counterparts, changing the 
dynamic towards a shared global ambition. 

I am also implementing a new type of 
mobility throughout the project. This will 
enable appropriate domestic-agency staff to 
be temporarily seconded to the ITER site, or 
central-team staff to be assigned to domestic 
agencies. 

The ITER Council has agreed to this new 
organization. I am grateful for their strong 
support and the progress already made 
in solving technical issues and improving 
communication. 


DISCRETIONARY FUND 
There is still much more to do. Authority 
requires the financial means to exercise it. 
I have asked for the creation of a reserve 
fund, to be put at my disposal. Each domes- 
tic agency will contribute, allowing me to 
take quick and efficient decisions to address 
issues as they arise. Terms of reference will be 
presented to the council in June for approval. 
The money will be drawn from the contribu- 
tions of the ITER members in proportion to 
the amount they pay in. 

In my experience of industrial projects, a 
reserve fund must comprise about 20% of 
fabrication costs over the duration of construction. 
In my view, it was naive not to 
establish such a fund much earlier in ITER’s 
history. 

Before the end of this year, I am expected 
to submit, along with all stakeholders, an 
updated, robust and reliable schedule to the 
ITER Council, and a cost and risk analysis. 
With renewed management and a stream- 
lined organization, we are now ready to pre- 
pare for the assembly and commissioning 
phase, the step before fusion switches on. 

Further delays and costs are inevitable. 
ITER will meet these challenges if it has the 
unanimous political support of the seven 
members, on the basis of the long-term value 
of fusion technology. 

Construction at St-Paul-lez-Durance, France, site of the ITER nuclear-fusion experiment. 

All of us at ITER have a huge, historic 
responsibility. The project may be the last 
chance we have this century to demonstrate 
that fusion is manageable. 


Bernard Bigot is director-general of the 
ITER Organization, St-Paul-lez-Durance, 
France. 

e-mail: bernard.bigot@iter.org 


Use mouse biobanks 
or lose them. 


Now that genetic engineering of mice is so easy, centralized repositories are 
essential, argue Kent Lloyd and colleagues. 


Tens of thousands of genetically 
engineered mice have been bred to 
probe human biology and disease. 
Their numbers are poised to mushroom. 
New genome-editing technologies such 
as CRISPR/Cas9 mean that making an 
animal that carries several customized 
mutations can be done in a matter of 
months, rather than years. Investigators 
who would not previously have considered 
making mutant mice are now doing so. 
But laboratories that can make genetically 
modified mice are often unable to maintain 
them. Progeny frequently carry pathogens, 
lose carefully designed mutations or have 
other characteristics that confound experi- 
ments. So the mice that a researcher might 
dutifully ship to a colleague can be very 
different from those described in a paper. 
In 2013, the Mutant Mouse Resource and 
Research Centers (MMRRCs), a consortium 
of the US National Institutes of Health, 
found that 32 of around 200 mouse lines 
deposited with them from individual labs 
did not match researchers’ descriptions. It is 
no wonder that many preclinical studies performed 
using mice are not reproducible. 

As administrators of publicly funded 
animal repositories charged with preserv- 
ing and distributing genetically engineered 
mouse lines, we routinely encounter — and 
correct — problems introduced by inap- 
propriate breeding, animal husbandry and 
quality control. We worry that the explo- 
sion of new mouse models could create a 
surge of wasted effort and irreproducible 
results. Better use of repositories could 
avoid this problem. 


IN THE BANK 
For the past 16 years, the MMRRCs have 
maintained unique mouse models deposited 
by individual scientists. The collections 
encompass around 4,600 specific mutations 
in mice, and tens of thousands of mutations in 
frozen embryonic stem cells that can be used 
to generate mice. Last year, the MMRRCs 
and the Jackson Laboratory (JAX), a US 
non-profit biomedical-research organization, 
together distributed more than 200,000 live 
engineered mice, as well as frozen embryos 
and sperm representing hundreds of mutant 
lines. Australia, Europe and Japan also have 
government-funded repositories. 

Mouse lines created by individual labs 
are often lost because of lack of interest or 
expertise. Commercial suppliers maintain 
only those lines that are in high demand. 
Making mouse lines publicly available 
from repositories renders these resources 
more accessible and eliminates costly, 
redundant efforts. It also relieves scientists 
of having to house animals and manage 
their breeding. 

But fewer than half of the roughly 
43,000 specific mutations listed in Mouse 
Genome Informatics, an international 
database of engineered mice, are listed as 
available from repositories. This is despite 
the fact that researchers funded by the 
US government are strongly encouraged 
to deposit mice for public distribution. A 
wide-ranging 2005 survey conducted by 
the NIH to investigate the extent of the 
problem found that, of 4,848 published 
mouse lines, only 12% were readily availa- 
ble from repositories. This forced scientists 
to rebuild mice; 2,655 had been remade 
at least once, and 702 had been remade 
independently more than three times (see 
‘Remaking mice’). The survey also 
spurred the Knockout Mouse 
Project (KOMP), which with international 
partners has made around 15,000 knockout 
alleles in embryonic stem cells, all deposited 
in repositories, including the KOMP 
Repository. 

A mouse bred to be diabetic, obese and hyperglycaemic. 

REMAKING MICE 

When engineered animals are unavailable, 
researchers make them again. The most 
recent comprehensive survey, carried out in 
2005, found that researchers had made 
thousands of mouse lines more than once, 
wasting animals, time and money. 

Since the survey, journals and funders 
have become more strict about requiring 
depositions. Yet we estimate that fewer than 
20% of mouse lines are submitted. The rate 
is likely to fall as more scientists are able to 
engineer lines. 


QUALITY CONTROL 

Most researchers who use mice are experts 
in their fields rather than in mouse genetics, 
husbandry or pathology. Reagents remain 
relatively constant; a mouse is a living, 
breeding creature. Change is the default, 
and change over generations must be understood, 
monitored and managed. 

Ordering a breeding pair of mice from 
a repository typically takes 3–5 weeks and 
costs US$400–600, more if mice must be 
created from frozen stocks. (Engineering 
a mouse from scratch can cost upwards of 
$20,000.) Many researchers prefer to get 
animals straight from a colleague. Although 
sharing is laudable, mice obtained from 
research labs rarely go through the rigorous 
checks that are standard practice in repositories. 
Receiving labs are risking the reliability of 
subsequent experiments, and perhaps even the 
health of their vivaria. 

Repositories ensure the quality and welfare 
of distributed animals and supply expertise 
to guide reliable studies. 
This means that researchers learn more from 
the animal experiments they conduct. They 
address the problem of improperly identified 
animals in several ways. By the time results 
from an engineered mouse line are published 
in a paper, the line has probably bred through 
several generations and undergone genetic 
drift. Repositories can accept a mouse line 
on article submission (and even hold off 
distribution until publication) and maintain 
animals closer to the original description. 

What is more, repositories routinely 
analyse animals’ genomes before mak- 
ing them available and so catch mistakes. 
Sometimes researchers overlook mutations 
that have been engineered into a mouse line, 
which can alter the animals’ traits or corrupt 
attempts at appropriate breeding. In 2013, 
genetic tests on 416 mutant mouse lines sub- 
mitted to JAX and the MMRRCs found that 
15% carried mutations for traits other than 
that specified, or contained genetic markers 
used to track mouse breeding not intended 
to be part of the line. The most frequent mix- 
up is essentially a typo: a common strain 
annotated as C57BL/6J is instead another 
called C57BL/6N. Although the pups look 
identical, the mice are very different. The 6N 
mice quickly develop bad eyesight, and 6J 
mice are susceptible to diabetes and obesity. 
Such traits can cause results to be misinter- 
preted and experiments to be irreproducible. 
Researchers might also treat a line as 
breeding pure (with no mixing of genetic 
backgrounds) when it does not. All 
MMRRCs have received submissions of 
engineered lines that contained a mixture 
of mice carrying the mutation and 
‘wild-type’ mice. We have also encountered 
many instances in which a genetic 
marker (such as a fluorescent protein) was 
decoupled from the mutation it was sup- 
posed to identify. No surprise, then, that 
researchers who receive mice from col- 
leagues can conduct several rounds of 
experiments, only to find that they have been 
studying mice that lack the desired mutation. 

Another underappreciated source of 
variability is microbes. Identical mutations 
in a gene active in T cells made at two institu- 
tions revealed similarities on the molecular 
and cellular levels, but profound differences 
in the animals. At one institution, mice con- 
sistently developed prolapsed rectums and 
died two months after birth. Careful inves- 
tigation revealed that this was caused by a 
stealth outbreak of the bacterium Helicobacter 
hepaticus. Repositories control for pathogens 
through frequent monitoring — and the abil- 
ity to revive the strain under germ-free condi- 
tions from frozen embryos or gametes. 

The microbiome (the collective DNA 
of microbes residing in the gut), which 
can vary for mice at different institutions 
even when they are fed identical diets, also 
causes surprising differences. Repositories 
are beginning to use DNA sequencing 
to define common variables such as 
diet, housing and other factors that may 
modulate microbiota. This could reveal 
the effects of these variables on a variety of 
mouse traits, and make animal studies conducted 
by collaborating investigators more 
efficient and reproducible. 

SOURCE: NIH/THE JACKSON LABORATORY 

Littermates are often a mixture of mice carrying a required mutation and ‘wild type’. 

Finally, repositories can help to pin down 
unexpected causes of mouse traits, making 
the mice more useful to researchers. For 
example, a strong difference between two 
strains’ responses to cocaine and metham- 
phetamine was recently mapped to a site 
in a single gene that differed between the 
strains. At the MMRRC at the University of 
North Carolina at Chapel Hill, work to sort 
out crosses between mouse lines generated a 
variety of traits, including a line of mice that 
developed severe inflammation in the gut. 
The line has now been distributed to several 
organizations as a model for the human disease 
known as spontaneous colitis. 


FOUR STEPS FORWARD 

If these benefits are to accrue, researchers 
must deposit their mice in repositories. Cur- 
rently they may not for three reasons: they 
are unaware that repositories exist; they mis- 
takenly think that they must pay for submis- 
sion; or they want greater control over when 
and how their lines are distributed. Scientists 
should be better global citizens, and funders 
and journals must be more diligent in creating 
and enforcing requirements for deposition. 


Next, scientists must use mice from 
repositories. The catalogue number and 
other documentation that repositories sup- 
ply will ensure clear tracking of the mice, 
and the quality control that repositories 
routinely perform will ensure that descrip- 
tions match the actual mouse. The ARRIVE 
(Animals in Research: Reporting In Vivo 
Experiments) guidelines, increasingly followed 
by publications that report animal 
research, should be amended to encourage 
acquisition from repositories. Mice obtained 
from non-specialists should undergo docu- 
mented quality control — from genotyping 
to pathogen monitoring. 

Third, repositories must work together to 
enhance their services. They should organize 
themselves into an integrated global network 
to share best practices, harmonize protocols 
and procedures, and innovate. Goals should 
include the development of certified quality- 
control practices and streamlining institu- 
tional transfer agreements to take in new 
mouse strains and to guarantee better dis- 
tribution, particularly across international 
borders. 

Continued investment is all the more 
important as researchers are called on to 
meet government mandates that may require 
individual studies to include more animals. 
Recent examples include Research Councils 
UK’s requirement for researchers to statisti- 
cally validate the numbers of mice used and 
the NIH’s mandate to study both male and 
female mice. Meanwhile, more mouse lines 
are being made by individual labs, and the 
International Mouse Phenotyping Consor- 
tium is set to complete more than 20,000 
mouse models by 2021, encompassing 
virtually the entire mouse genome. 

Without sufficient investment, we fear 
a vicious cycle in which repositories are 
unable to cope with increasing demands 
and become less able to serve the scientific 
community, keeping fewer live mice ready 
for distribution. Worse, they will close. (At 
least one already has.) 

Like money in the bank, repositories keep 
mouse models safe, secure and available for 
withdrawal. Just as a bank makes returns on 
investments, repositories add scientific value 
and utility to deposited mouse lines: they 
increase reliability through curation, pres- 
ervation, genetic quality control, protection 
from pathogens and more.


Kent Lloyd is director of the Mutant Mouse
Resource and Research Center (MMRRC)
at the University of California, Davis,
California, USA. Craig Franklin is director 
of the MMRRC at the University of Missouri, 
Columbia, Missouri, USA. Cat Lutz is 
director of the MMRRC at the Jackson 
Laboratory in Bar Harbor, Maine, USA. 
Terry Magnuson is director of the MMRRC 
at the University of North Carolina at Chapel 
Hill, North Carolina, USA. 

e-mail: kclloyd@ucdavis.edu

Piltdown Man (held by Alvan Marston, who helped to debunk the fraud), misled some palaeontologists.


HUMAN EVOLUTION 


How we misread 
our own story 


William Davies ponders a chronicle unwinding the 
twisted strands of thinking on human evolution. 


In The Strange Case of the Rickety Cossack,
palaeoanthropologist Ian Tattersall outlines
the history of thought on human
evolution clearly and insightfully, allowing
readers to make up their own minds about
the motives and actions of key figures. The
field, he reveals, has both benefited and
suffered from involving other disciplines.
Concepts rejected elsewhere have been
applied to palaeoanthropology, and these
have reinforced the fallacious idea that
human evolution is distinct from that of
other life forms. The history of the field
reveals a divide between those who prefer
linear sequences of speciation, and those who
prefer a many-branched tree: the ‘lumpers’
and ‘splitters’.

From the mid-nineteenth century, the
study of human origins bristled with self-
appointed experts. The book’s title refers to
a case in point: in the late 1850s, physiologist
August Franz Mayer identified Neanderthal
remains from northern Germany as belong-
ing to a Cossack soldier with rickets who had
died in 1814 and somehow become buried in
2 metres of fossiliferous deposits. Other anat- 
omists, including Thomas Henry Huxley, 
were happy to agree that the individual ana- 
tomical features of Neanderthals fell within 
the range of variation in Homo sapiens. 

The lack of connection between those 
who recovered the fossilized and archaeo- 
logical remains of early hominins, and those 
who sought to interpret them, is perhaps 
the most striking feature of the first 60-70 
years of palaeoanthropology. It ensured 
that wider archaeological and ecological 
contexts were all but ignored in the appli- 
cation of predetermined (and untested) 
theories. So, for early-twentieth-century 
anatomist Marcellin Boule, fossils such as 
the Javan Homo erectus and the European
Neanderthals represented extinguished
side-branches on a developmental tree
that led to H. sapiens by other means.

The archaeological situation was not
much better. Gabriel [...] nouncements in the
[...] mes that stone- [...] evidence from Upper
Palaeolithic sites in
France that had been correctly sequenced by 
Edouard Lartet years before. The Piltdown 
Man fraud of 1912 was designed to appeal 
to prejudices in favour of early development 
of a large brain (and even included a bone 
artefact carved to resemble a cricket bat; 
see C. Stringer Nature 492, 177-179; 2012). 
Anyone who had focused on how the Pilt- 
down remains had been recovered and on 
their context would have been sceptical of 
such predeterminism. 

Tattersall provides a useful discussion of 
the chaotic and idiosyncratic nomenclature 
created in the first half of the twentieth cen- 
tury. Almost every hominin fossil was clas- 
sified as its own species, so people had little 
sense of broader patterns. Enter evolution- 
ary biologist Ernst Mayr. A proponent of 
the evolutionary synthesis of the 1930s and 
1940s, which unified Mendelian genetic 
inheritance with Darwinian natural selec- 
tion, Mayr demanded that palaeoanthro- 
pology be aligned with wider evolutionary 
research. As a result of his address at the 
international meeting ‘The Origin and Evo-
lution of Man’ in 1950, the number of homi-
nin species was reduced. The postulated
ancestors of H. sapiens were lumped into a 
single lineage of gradually evolving subspe- 
cies separated by barriers such as oceans and 
extending back at least 2 million years. 

A change in analytic methods was needed 
before this single-species hypothesis could 
be supplanted by branching taxonomies with 
complex patterns of localized speciation, 
extinction and population replacement. In 
the mid-1970s, palaeoanthropologists began 
to use cladistic analyses (which group organ- 
isms on the basis of shared characteristics) to 
evaluate possible links between species. But 
using such methods, Tattersall and palaeon-
tologist Niles Eldredge found it difficult to
demonstrate that H. erectus was an ancestor
of H. sapiens.

Punctuated equilibrium (stasis
interspersed with rapid diversification, a
concept that Eldredge co-developed) has
been applied to hominin fossils and to suc-
cessions of archaeological ‘cultures’, solely on
the basis of recovered artefacts. But as Tat- 
tersall points out, views of transitional fos- 
sils and cultures as denoting sudden shifts 
between stable states are poorly theorized. 

The single-species hypothesis has never 
quite disappeared. It has been reinvented 
as multiregional evolution from the 1970s 
onwards, by Milford Wolpoff and others. 
They model the transition to H. sapiens at 
the global scale, positing regional popula- 
tions all effectively occupying the same 
niche wherever they happen to be, and 
interbreeding so that the species develops at 
the same rate everywhere. That would make 
increases in hominin brain size part of a uni- 
versal trend. However, Tattersall identifies 
three separate episodes of relative brain-size 
increase within Homo: in H. erectus in Asia, 
and then, much later, in Neanderthals in 
Europe and H. sapiens in Africa. Regional 
and multiregional approaches still coexist, 
but current evidence tends to support the 
regionalized approach. The niches occu- 
pied by our ancestors and their relatives were 
likely to have been more varied than multi- 
regionalists would believe. There remains 
the question of how much the accuracy and 
precision of dating methods (and the quan- 
tity of data available) condition our discus- 
sions. If we had more and better dates, would 
we have a finer-grained view of change and 
variation? 

The Strange Case of the Rickety Cossack 
is an interesting critical evaluation of how 
palaeoanthropology has developed. Rivalries 
between teams are delineated and used to 
explain how we know what we know. Many 
new hominin species have been identified in 
recent years, but it is not yet clear how they 
are related to us. More work is needed on 
the classification of skeletal material from 
Dmanisi in Georgia, which encompasses 
extraordinary morphological variety, and 
from Flores in Indonesia, where the ‘hobbit’ 
Homo floresiensis was found (see C. Stringer 
Nature 514, 427-429; 2014). 

Some of Tattersall’s assertions will gener- 
ate heated debate — particularly the claim 
that the large-brained Neanderthals were 
empirical artisans, rather than symbolic 
artists. Current archaeological evidence 
indicates that Neanderthals were able 
to innovate, but that these innovations 
may have been kept within small-scale 
social networks. By contrast, the future of 
palaeoanthropology lies in its ability to make 
extensive connections.


William Davies is director of the Centre for 
the Archaeology of Human Origins at the 
University of Southampton, UK. 

e-mail: s.w.g.davies@soton.ac.uk

Books in brief


The Great Divide: Unequal Societies and What We Can Do About 
Them 

Joseph E. Stiglitz W. W. NORTON (2015) 

That 1% of the world now owns nearly half of the wealth is weakening 
the global economy. So argues Nobel-prizewinning economist 
Joseph Stiglitz in this collection of writing originally published in 
Vanity Fair and elsewhere. He ranges with searing honesty from
the deregulation and tax cuts for the rich that spurred the 2008
meltdown to the ebbing of socio-economic mobility. His solutions to 
the crisis are presented authoritatively as eminently doable — from 
boosting corporate taxes to investing in science and education. 


A Natural History of English Gardening 

Mark Laird YALE UNIVERSITY PRESS (2015) 

In this vast, stunningly illustrated history of gardening in England, 
landscape historian Mark Laird focuses on a fertile moment — the 
mid-seventeenth to early nineteenth centuries. During that time, he 
argues, natural history (the discovery of order in nature) emerged 
from the evolution of the garden (nature’s ordered microcosm). Laird 
marshals climatic events such as the Little Ice Age winter of 1683 
and the drought a century later to contextualize advances in forestry 
and garden design by John Evelyn, and in horticultural science by 
Mary Somerset, Duchess of Beaufort, among other developments. 


Domesticated: Evolution in a Man-Made World 

Richard C. Francis W. W. NORTON (2015) 

We humans evolve side by side with other animals in the process
of domestication, and in this intriguing study, science journalist
Richard Francis tracks those changes. As he shows, both natural and
artificial selection have worked powerfully to create diversity in size
and shape in domesticated animals, notably the dog. Yet “evolution 
is still fundamentally conservative”, he notes: the wolf lingers in the 
chihuahua. Francis presents numerous case studies, from ferrets 
and camels to reindeer and us. Our self-domestication, he avers, has 
driven the cultural dynamism that has made us what we are. 


Elephant Don: The Politics of a Pachyderm Posse 

Caitlin O’Connell UNIVERSITY OF CHICAGO PRESS (2015) 

The jaunty title belies the scholarly weight of Caitlin O’Connell’s study 
on social behaviour in a group of African bull elephants in Namibia’s 
Etosha National Park. O’Connell, who also works on the role of 
vibration in mammal communication, offers a riveting account. We 
see the pachyderms dipping their trunks into the mouth of dominant 
bull Greg; battling or welcoming would-be members; and, when Greg 
disappears, standing tail to tail, facing out as if listening for some 
seismic clue. Full of vivid detail, such as waking up to the “demonic- 
sounding giggling” of hyenas. 


Plankton: Wonders of the Drifting World 

Christian Sardet UNIVERSITY OF CHICAGO PRESS (2015) 

They have vital roles in climate and food chains, but their minuscule 
size means that plankton impinge little on the public consciousness. 
In this beautiful book, marine scientist Christian Sardet shows that 
tiny plankton, not enormous blue whales, are the real stars of the 
ocean. Macro pictures of the huge variety of plankton forms and short 
details of their lives force a reconsideration of our view of them as part 
of an amorphous soup. A celebration of the small, and an unalloyed 
joy. (See the Nature video at go.nature.com/gegecq.) Barbara Kiser

Correspondence


The joys of research 
in retirement 


After retiring some ten years ago 
at the age of 65, I still wanted to 
do some worthwhile research 
(Nature 521, 20-23; 2015). 

I had only a chair and a table
for support. These props came 
courtesy of my former employer, 
along with online access to 
the scientific literature. I was 
originally a researcher in two 
very different fields — surface 
science and nanoparticle-related 
health effects — so I set about 
re-evaluating publications 
in both areas. New ideas 
emerged, sparking successful 
collaborations with former 
colleagues who had the necessary 
equipment to investigate them. 

I have written and published
30 mostly single-author papers 
since retirement. Most notable 
is a 123-page review that took 
me almost 2 years to prepare, 
allowing me to invalidate a 
theory that was more than 
30 years old (K. Wittmaack Surf. 
Sci. Rep. 68, 108-230; 2013). 

I reckon that I have learned more 
per unit time during this phase 
than I did during my ‘active’ 
career. And still I keep going. 

Colleagues with retirement in 
sight should give up the idea that 
science can only be advanced with 
a sizeable research team. Sit down 
and take the literature to pieces, 
then put the puzzle together again 
in light of your newly gained 
insight. Gratifying work awaits. 
Klaus Wittmaack Helmholtz-
Zentrum München, Institute of
Radiation Protection, Germany. 
wittmaack@helmholtz-muenchen.de 


Phosphate mining 
risks atoll culture 


Mataiva atoll in the Pacific Ocean 
has an unusual morphology: its 
central lagoon is divided into 
numerous shallow basins by a
network of slightly submerged
coral shoals. This extremely rare
geological feature, known as a
reticulated lagoon, is now under
threat from the global demand for
phosphate, used in agriculture.

International companies 
and the government of French 
Polynesia have attempted to 
mine Mataiva’s rich phosphate 
resources since the 1980s, but 
have so far been thwarted by its 
inhabitants. 

Opposition was on the basis of 
the potentially disruptive effects 
of mining on the population's 
identity and on its culture of 
coconut farming and fishing. An 
18-month test of the extraction 
process in 1986 destroyed and 
polluted fish habitats to the extent 
that people reportedly could not 
eat lagoon fish for 10 years. Island 
people also feared the loss of land 
rights, because their livelihoods 
depend on farming and drying 
coconut flesh (copra). 

The government this year 
announced its decision to resume 
phosphate extraction. Resistance 
is now dangerously weakened as 
the atoll’s elders dwindle and the 
younger generation moves away, 
severing the cultural attachment 
to the atoll’s traditional way 
of life. The world’s geological 
patrimony is again at stake. 
Alexandre Magnan Institute for 
Sustainable Development and 
International Relations (IDDRI), 
Sciences Po, Paris, France. 
Virginie Duvat Littoral, 
Environment and Societies Research 
Unit (LIENSs, UMR 7266), 
University of La Rochelle and
CNRS, La Rochelle, France.
alexandre.magnan@iddri.org 


Share surplus 
animal tissue 


Strict regulations govern the 
use of laboratory animals in 
research (see, for example, 
K. Davies Nature 521, 7; 2015), 
but scientists are under 
increasing pressure to justify 
their experiments and address 
public concerns (Nature 520, 
271-272; 2015). Initiatives such 
as SEARCHBreast avoid the need 
to set up further in vivo models 
by using surplus archival tissue 
from previous animal studies. 
SEARCHBreast (for ‘Sharing 
Experimental Animal Resources: 
Coordinating Holdings in 
Breast Cancer’) is a searchable 
platform of tissues, resources 
and information derived from 
animal models of breast cancer 
(www.searchbreast.org). These 
materials can be deployed 
for characterizing tumour 
biomarkers and genetically 
engineered animal models, for 
example, or for investigating 
treatment effects on archived 
human-to-mouse xenografts. 
The ‘SEARCH’ blueprint
translates to other diseases: 
for example, ShARM (Shared 
Ageing Research Models; 
www.sharmuk.org) aims to
accelerate research on ageing.
Such resources support official
‘replacement, refinement
and reduction’ policies (see
www.nc3rs.org.uk/the-3rs) and 
save time and money. 

Valerie Speirs* Leeds Institute of 
Cancer and Pathology, University 
of Leeds, UK. 
v.speirs@leeds.ac.uk 

*On behalf of 7 correspondents (see 
go.nature.com/gthlzc for full list). 


Climate advisers 
must be astute 


Oliver Geden suggests that 
scientific advisers on climate 
should resist becoming “political 
entrepreneurs” by making
their advice more palatable
(Nature 521, 27-28; 2015). In 
fact, climate advisers need to be 
astute political entrepreneurs if 
they are to present the benefits 
of a policy change without 
exaggerating claims. 

Political pragmatism is not for 
helping policy-makers to justify 
the status quo, but rather for 
presenting persuasive scientific 
evidence alongside other issues 
(D.C. Rose Nature Clim. Change 
4, 1038; 2014). Entrepreneurial 
climate scientists can offer fresh 
solutions to policy-makers, point 
out the improvements their ideas 
would provide and explain how 
they would work in practice. 
These entrepreneurs take the 
concerns of other scientists and 
policy-makers into account, 
build professional networks, 
and use every opportunity to 
maximize political influence 
(M. Mintrom and P. Norman 
Policy Stud. J. 37, 649-667; 2009). 

Optimizing science 
presentation does not mean 
compromising on technical 
rigour or integrity. Climate 
scientists can increase their 
understanding of how policy- 
makers use the evidence they 
provide, as Geden recommends, 
and so deploy it more effectively 
to argue for policy change. 
David C. Rose University of 
Cambridge, UK. 
dcr31@hermes.cam.ac.uk

A master lock for deadly parasites


An RNA-interference screen has identified the protein CD55, expressed on the surface of red blood cells, as an essential
receptor for infection of the cells by the malaria parasite Plasmodium falciparum. 


WAI-HONG THAM & ALEXANDER T. KENNEDY 


There are trillions of cells in the human
body and around 200 distinct cell types,
some of which can be hijacked as safe 
houses by deadly pathogens. Malaria para- 
sites, which infect millions of humans annu- 
ally, have a preference for residing in liver and 
blood cells. To enter these cells, the parasites 
make proteins that recognize other proteins on 
the target cell’s surface. By analogy, the para- 
site proteins fit in a key-like fashion to their 
locks, the red blood cell proteins. But only a 
handful of these lock-and-key combinations 
has been discovered. Writing in Science, Egan 
et al. combine two exciting technologies,
RNA interference and ex vivo production of
red blood cells, to identify other receptor
proteins involved in malaria-parasite entry.
In 1975, live imaging of malaria para- 
sites entering red blood cells highlighted a 
dynamic process with distinct observable 
steps. The form of the malaria parasite that
enters these cells is a merozoite, characterized 
by an ovoid shape and an apical tip. After initial 
contact with the red blood cell, the merozoite 
reorients so that its apical tip is in close prox- 
imity to the cell surface. Parasite proteins in 
the apical prominence recognize red blood 
cell proteins, triggering irreversible commit- 
ment to entry — turning the key in the lock 
and opening the door. Subsequently, a tight 
junction forms between parasite and red 
blood cell membranes, propelling the parasite
into the cell and beginning the blood stage of
infection. Intensive efforts to develop a vaccine 
against this stage have focused on identifying 
all the lock-and-key combinations involved in 
parasite invasion and on ways to block parasite 
entry. 

Egan et al. have performed the first large 
knockdown screen of red blood cell proteins 
that are bound by the most lethal human 
malaria parasite, Plasmodium falciparum. 
The researchers targeted 42 genes that encode 
proteins determining blood group, on the basis 
that all P. falciparum receptors known so far
belong to this protein group and that these 
genes are not involved in red blood cell pro- 
duction (Fig. 1). The authors inserted a library 
of small hairpin RNA molecules (shRNAs) that 
bind to, and thus inhibit expression of, these 
genes into isolated haematopoietic progenitor 
cells (which give rise to all blood cells), and 
then induced these cells to proliferate and dif- 
ferentiate into red blood cells. Once the cells 
were mature enough to sustain parasite devel-
opment, they were infected with P. falciparum
expressing green fluorescent protein, thus 
allowing infected cells to be identified. 

The authors then sequenced the shRNAs 
in these cells and compared the levels of each 
shRNA between infected and non-infected cell 
populations. Positive targets were identified 
as genes whose corresponding shRNAs were 
under-represented in the infected popula- 
tion, following the logic that inhibiting these 
genes impairs parasite infection. Among the
authors’ positive hits were genes encoding the
proteins basigin and complement receptor 1
(CR1), which are known to be involved in
P. falciparum invasion. More excitingly, they
identified the cell-surface protein CD55 as the
top-ranked candidate for a new entry portal
for P. falciparum.

A key aspect of this discovery is that CD55 is
an essential lock in the invasion process: Egan
and colleagues show that diverse P. falciparum
laboratory strains and field isolates could not
infect red blood cells lacking CD55. The para-
site uses several lock-and-key combinations to
enter red blood cells, to maximize its opportu-
nities for entry, and this redundancy presents
a huge hurdle to the development of vaccines
to block blood-stage infection. Therefore,
interactions that are essential in the invasion
process provide attractive vaccine candidates.
The effect of loss of CD55 on parasite invasion
is similar to the loss of basigin, whose parasite
protein partner PfRh5 is a leading blood-stage
vaccine candidate. Clearly, an outstanding
question stemming from this work is the iden-
tification of the P. falciparum protein that binds
to CD55. If CD55 is a crucial host factor for
P. falciparum invasion, the hypothesis would
be that the parasite ligand is also an essen-
tial factor and thus warrants inclusion in the
vaccine-development pipeline.

Egan et al. also show that loss of CD55 on
red blood cells affects the proliferation of
diverse P. falciparum strains. The authors
suggest that CD55 may be involved in the
irreversible-commitment phase of parasite
invasion. This hypothesis should be further
explored using live and high-resolution imag-
ing of merozoites attempting to infect CD55-
deficient red blood cells. It will be interesting
to determine whether the loss of CD55 affects
the establishment of commitment, deforma-
tion of the red blood cell surface or signalling
for tight-junction formation during the early
stages of parasite invasion.

Figure 1 | RNA interference identifies malaria-parasite entry
receptors. a, To identify proteins on red blood cells to which malaria
parasites bind, Egan et al. created a library of short hairpin RNA (shRNA)
molecules that block the expression of genes encoding blood-group proteins.
These shRNAs were transduced into haematopoietic progenitor cells (HPCs),
and the HPCs were induced to proliferate and differentiate into red blood
cells. The authors then infected the cells with fluorescent malaria parasites and
compared the abundance of shRNA molecules in uninfected and
infected cells; those shRNAs that were under-represented in the infected
population were determined to correspond to genes encoding receptor
proteins for parasite entry, because inhibition of these genes would inhibit the
parasite’s ability to infect. b, Alongside the known parasite receptors CR1 and
basigin, the authors identified several new candidates, of which CD55 was [...]
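
The read-out of such a screen is, at heart, a comparison of shRNA frequencies in two cell populations. The following is a minimal Python sketch of that logic, not the analysis that Egan et al. actually ran: the function name, the pseudocount, the per-gene median and all of the counts are assumptions made purely for illustration, and the gene and shRNA labels are placeholders.

```python
import math
from collections import defaultdict
from statistics import median


def rank_depleted_genes(infected, uninfected, shrna_to_gene, pseudocount=1.0):
    """Rank genes by how depleted their shRNAs are among infected cells.

    `infected` and `uninfected` map shRNA labels to sequencing read counts.
    Returns (gene, score) pairs, most depleted first. The score is the median
    log2 ratio of infected to uninfected frequency across a gene's shRNAs.
    """
    total_infected = sum(infected.values())
    total_uninfected = sum(uninfected.values())
    per_gene = defaultdict(list)
    for shrna, gene in shrna_to_gene.items():
        # Divide by library size so that sequencing depth does not bias the
        # ratio; the pseudocount avoids taking the logarithm of zero.
        freq_infected = (infected.get(shrna, 0) + pseudocount) / total_infected
        freq_uninfected = (uninfected.get(shrna, 0) + pseudocount) / total_uninfected
        per_gene[gene].append(math.log2(freq_infected / freq_uninfected))
    scores = {gene: median(values) for gene, values in per_gene.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Invented toy counts (not data from the study); labels are placeholders.
shrna_to_gene = {
    "sh1": "geneA", "sh2": "geneA",
    "sh3": "geneB",
    "sh4": "control", "sh5": "control",
}
infected = {"sh1": 10, "sh2": 20, "sh3": 30, "sh4": 470, "sh5": 470}
uninfected = {"sh1": 200, "sh2": 200, "sh3": 200, "sh4": 200, "sh5": 200}

ranking = rank_depleted_genes(infected, uninfected, shrna_to_gene)
for gene, score in ranking:
    print(f"{gene}: median log2(infected/uninfected) = {score:.2f}")
```

Genes with the most negative scores are those whose knockdown was rarest among infected cells, which is the signature expected of a receptor needed for entry. A real analysis would also need replicates and a statistical test before any hit could be called.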

Could CD55 be a therapeutic target for 
malaria infections? Although some healthy 
humans exist without CD55 on their blood 
cells, important caveats arise for individuals 
living in malarious regions. Field studies show 
a correlation between declining CD55 levels on
red blood cells and severe malarial anaemia,
potentially due to the destruction of red blood
cells by the complement system, an arm of
the immune system that is regulated by CD55. 
An alternative avenue could be to explore the 
potential of soluble CD55 as a competitor to 
red blood cell CD55 for merozoite binding, 
and thus an inhibitor of parasite growth. 

An emerging theme in P. falciparum 
invasion is that the parasite exploits comple- 
ment regulators as entry receptors. In humans, 
CR1 and CD55 protect self-tissues from com- 
plement attack. A variable currently missing 
from experimental work monitoring P. falci-
parum invasion is the addition of active com-
plement-system components during parasite
entry. For example, complement activation
is known to modulate the behaviour of both
CD55 and CR1 on membranes, resulting in
changes in red blood cell deformability that
may affect merozoite entry. Binding of para-
site ligands to these receptors may interfere with
their regulatory roles (as is the case with CR1), 
and the consequences of this for red blood cell 
survival during infection need to be under- 
stood. Current research has led to a remarkable 
deconstruction of the distinct steps in P. falci-
parum invasion, but future challenges will be
to understand merozoite entry in the context of
complement activation and immune attack.


Nuclear dilemma
resolved


After cell division, membranes become fused around the nucleus to encapsulate 
the cell’s chromosomes. It emerges that this process is regulated by the ESCRT-III
protein complex. SEE LETTERS P.231 & P.236 


BRIAN BURKE 


The chromosomes of animal, plant
and fungal cells are enclosed within 
a nucleus, which is encapsulated by 
a membranous structure called the nuclear 
envelope. This poses a problem when the 
time comes for cells to divide, because chro- 
mosomes segregate into two daughter cells 
by binding to and moving along a structure 
called the mitotic spindle, which, at least in 
multicellular organisms, is located outside 
the nucleus. In vertebrate cells, the nuclear
envelope is normally partly or completely
disassembled before chromosome segre- 
gation, and new envelopes are assembled 
afterwards — a process that requires exten- 
sive membrane fusion. However, the mecha- 
nisms by which membrane fusion occurs 
have long puzzled cell biologists. In this issue, 
Olmos et al. (page 236) and Vietri et al.
(page 231) reveal that the nuclear envelope 
co-opts a membrane-sculpting protein 
complex called ESCRT-III to bring about 
reassembly. 

Figure 1 | Resealing the nuclear envelope. a, The nuclear envelope is composed of inner and outer
nuclear membranes (INM and ONM, respectively). The two are joined at junctions filled by nuclear
pore complexes (NPCs), and the ONM is joined to another membranous structure, the endoplasmic
reticulum (ER). During cell division, the nuclear envelope disassembles, and INM proteins disperse into
the ER. b, As the cell divides, chromosomes (which separate and then become decondensed) are pulled
to opposite poles of the cell by a structure called the mitotic spindle. Vietri et al. demonstrate that the
ESCRT-III protein complex recruits the enzyme spastin to sever the mitotic spindle. Both this group and
Olmos et al. show that ESCRT-III then promotes resealing of the nuclear envelope (black arrows).

The nuclear envelope completely encloses an
essentially spherical space and is composed of
inner and outer nuclear membranes. The two 
membranes are periodically joined at annular 
junctions, forming channels that connect the 
inside of the nucleus with the cell’s cytoplasm. 
These channels are occupied by multiprotein 
nuclear pore complexes (NPCs), which regu- 
late the trafficking of macromolecules across 
the envelope. 

The outer nuclear membrane is also 
connected to a membrane network called 
the endoplasmic reticulum, which permeates 
much of the cytoplasm. As such, the inner and 
outer nuclear membranes and the endoplas- 
mic reticulum constitute a single continuous 
membrane system. When the nuclear enve- 
lope breaks down during cell division, NPCs 
disassemble and the nuclear membranes are 
peeled open. This causes the constituents of the 
nuclear membrane to disperse into the endo- 
plasmic reticulum, where the proteins of the 
two structures become intermingled (Fig. 1a).

The nuclear membranes start to re-form as 
the chromosomes segregate to opposite poles 
of the cell. Membranes from the endoplasmic 
reticulum attach to and spread out over the 
surfaces of the mass of chromosomes, a pro- 
cess that is mediated by proteins of the inner 
nuclear membrane (Fig. 1b). But what mecha- 
nism is in place to close gaps within the mem- 
branes that eventually surround the daughter 
nuclei? Perhaps these holes are never actually 
sealed, but are instead plugged by reassem- 
bled NPCs. But this cannot be the whole story, 
because a sealed envelope can form even in the 
absence of NPCs. Indeed, the enzyme p97 can
drive fusion at annular junctions between the
inner and outer nuclear membranes to seal
NPC-free holes in the nuclear envelope. How- 
ever, a full understanding of this fusion process 
has remained out of reach. 

The ESCRT-III complex is known to have
roles in the formation of certain intracellular 
vesicles, in the budding of retroviruses from 
the membranes of infected cells, and in the 
abscission process that separates two daughter 
cells at the end of cell division. What all these 
seemingly disparate events have in common is 
that they involve membrane fusion, which gen- 
erates a membrane-bound compartment that 
is separate from, but topologically identical to, 
the cell’s cytoplasm. In each case, components 
of ESCRT-III act as a molecular drawstring
that constricts the neck of a membrane bud 
(or even of an entire cell) to promote annular 
fusion. This process is strikingly similar to the 
topological changes that occur when holes in 
the re-forming nuclear membranes are closed. 

The current studies demonstrate that com-
ponents of ESCRT-III accumulate transiently
at the edge of gaps in re-forming nuclear 
membranes — just as would be expected if 
the complex mediated the fusion of nuclear 
membranes. Such a role is borne out by the 
observation, made by both groups, that deple-
tion of the components of ESCRT-III results
in failure to seal the nuclear envelope. Olmos
et al. also show that p97 and its cofactor pro- 
tein, UFD1, are essential for the recruitment 
of key ESCRT-III subunits to the re-forming 
envelope. 

Vietri et al. further reveal that ESCRT-III 
has a complementary role in disassembling 
the mitotic spindle. The microtubule struc- 
tures that make up much of the spindle are 
attached to separating chromosomes, so they 
must be eliminated before the nuclear enve- 
lope can be sealed. Vietri and colleagues show 
that this elimination is carried out by spastin, a 
microtubule-severing protein that is attracted 
to the spindles by ESCRT-III. 

These authors find that interference with 
spastin results in delayed disassembly of the 
spindle, and prolonged association of ESCRT- 
III with the re-forming envelope. Not surpris- 
ingly, interference with ESCRT-III also impairs 
spindle disassembly. So ESCRT-III and spastin 
coordinate spindle disassembly with closure of 
the nuclear envelope. This represents a striking 
parallel with abscission, in which spastin sev- 
ers the spindle microtubules that pass between 
the two daughter cells. 

Together, the current studies reveal a 
previously unknown role for ESCRT-III in 
re-forming the nuclear envelope. The asso- 
ciation, albeit transient, between ESCRT-III 
and nuclear membranes raises the question of 
whether this complex, or a functional equiva- 
lent, might have other roles in envelope main- 
tenance. Indeed, there are several situations in 
which such activity might be required. 

For instance, some macromolecular 
complexes in fruitflies are exported from 
the nucleus by budding through the inner 
nuclear membrane, bypassing NPCs. Capsid
structures containing DNA from herpes
simplex viruses exit the nucleus in a similar
manner. These movements involve the type
of membrane remodelling that is a hallmark
of ESCRT-III. More dramatically, the Vpr pro-
tein, which is produced by HIV, is associated
with transient ruptures of the nuclear enve-
lope, and the membrane is again probably
resealed through a similar mechanism. Finally,
the elimination of misassembled NPCs in yeast
has been shown to depend on ESCRT-III. In
the light of these phenomena, it would be no
great surprise if ESCRT or ESCRT-like com-
plexes were shown to have other, hitherto
unappreciated, roles in nuclear-envelope
dynamics. ■


NANOPHOTONICS

Bright future for
hyperbolic chips


The unusual properties of hyperbolic metamaterials, such as their ability to 
propagate light on the nanoscale without diffraction, have been realized in 
two-dimensional devices, heralding improved photonic circuits. SEE LETTER P.192 


GUY BARTAL 


Devices known as photonic integrated circuits
could succeed electronic circuits in future
data-storage, computation and communications
technologies,
because they would allow improved data 
bandwidths and lower energy consump- 
tion. But such devices lag behind their elec- 
tronic counterparts because they are limited 
by diffraction effects that restrict their 
applications to micrometre scales, whereas
electronics have already reached the nano- 
metre scale. This shortcoming is due to the 
fact that the electromagnetic properties of 
typical optical media hinder the relay of tiny 
optical features. If a beam narrower than 
(or comparable to) the wavelength of light 
travels through such media, it will either be 
distorted when it reaches its destination, 
because of diffraction, or it will not get there 
at all, because of exponential decay. This is a
fundamental limitation of propagating waves.

On page 192 of this issue, High et al.
report the first experimental realization of
two-dimensional ‘hyperbolic metasurfaces’
(HMSs). The authors’ HMSs exhibit a
range of unconventional properties, includ-
ing colour-dependent negative refraction and
diffraction-less propagation, coupled with low
optical-transmission losses — all packed in a
tiny chip.

Hyperbolic metamaterials (HMMs) are
artificial structures whose optical properties
are highly direction dependent. They are made
of ultrathin multilayers or dense nanowire
arrays, and are renowned for their ability to
overcome the diffraction limit by enabling the
propagation of ultra-small features of electro-
magnetic waves. Moreover, they can sup-
port greater photon energy densities than can
conventional materials, thereby enhancing the
interaction of light with matter — a prop-
erty that can lead to improved signal modu-
lation and decreased energy consumption.
These are key ingredients for bringing HMMs
to the front line of integrated circuitry, on a
par with electronics. Their unusual properties
could also expand their applicability beyond
that of run-of-the-mill optical media.

Until recently, HMMs have been fabricated
only in three-dimensional configurations,
making them unsuitable for integration on flat
chips. Furthermore, these composite devices
often contain metallic parts that absorb light
and cause losses from resistivity, weakening
their electromagnetic-power throughput. Also,
preventing diffraction requires a certain design
that inevitably maximizes the damping of elec-
tromagnetic waves, reducing the waves’ effec-
tive propagation distances to less than 1 µm.

High and colleagues overcame these
issues by fabricating an HMS consisting of a
nanoscale grating on a single-crystal silver film
— a design that can prevent diffraction with-
out causing excessive losses from resistivity.
Moreover, using sophisticated crystal-growth
techniques and cutting-edge patterning meth-
ods, the authors were able to further minimize
both resistivity and scattering losses and to
achieve operational propagation distances.

What new on-chip functionalities result
from this work? The hallmark property of
HMMs is negative refraction, the ability to
bend a beam that crosses from one medium
into the HMM in the ‘wrong’ direction —
essentially, breaking the law of refraction.
Negative refraction is not typically observed
in naturally occurring materials, but it has
been demonstrated in various metamaterials
in the past 15 years. Not only have High
et al. produced the first chip to exhibit negative
refraction, but they have also shown that the
effect can be wavelength dependent (Fig. 1);
that is, their device allows certain colours of
visible light to be refracted in the ‘wrong’ sense,
whereas others refract normally.

Figure 1 | Normal versus negative refraction in a hyperbolic metasurface
(HMS). This set of images illustrates the effect of wavelength on the
refraction of optical beams as they impinge on the interface of a silver
film with the HMS grating, in a device built by High and colleagues.
The grating consists of nanoscale grooves. In a and b, a beam is refracted in
the normal sense, whereas in c and d negative refraction occurs
(the respective wavelengths are labelled). In each case, the dotted box
indicates the area covered by the HMS and the solid line indicates
the angle of refraction. Devices based on nanoscale photonic circuits
will be able to exploit this phenomenon to facilitate wavelength-based
switching and routing of light and to increase the amount of information
transferred. (Source: ref. 3.)

This property could facilitate wavelength-
based switching and routing of light in pho-
tonic circuits. No less importantly, it could
be used to counter the natural tendency of a
tightly focused light beam to expand as it trav-
els, because the transition from normal to neg-
ative refraction occurs at a certain wavelength
that depends on the material's design. At this
wavelength, the beam impinging on the HMS
does not diffract, but propagates unimpeded
without sideways loss of energy, irrespective of
the beam’s launch angle or width (see Fig. 3a, b
of High and colleagues’ paper). The devices
built by the authors take advantage of this
effect, so that each groove of the grating can
channel this particular wavelength, regardless
of how closely spaced the grooves are or how
small their intrinsic width is compared with
the wavelength in question. In fully fledged
HMS devices, this would allow a substantial
increase in the information capacity trans-
ferred across small chips. Diffraction-less 2D
imaging could be one of many other potential
applications.

High and colleagues further demonstrate
that they can selectively route light beams
of visible frequency not only by the beams’
colour, but also by the photons’ spin. Spin is
a fundamental signature of photons, and is
associated with the circular polarization of
electromagnetic waves (the direction of rota- 
tion of the electric field in time and space). In 
one of the devices demonstrated in the cur- 
rent work, a beam of left-handed polarization 
is diverted to a direction opposite to that of 
a right-polarized beam. Although this phe- 
nomenon has been previously demonstrated 
in metasurfaces and HMMs, what is unique
here is the combination in prototype devices 
of colour sensitivity, polarization-dependent 
refraction, enhanced light—matter interaction 
and significant reduction in optical losses. The 
ability to encapsulate these desirable proper- 
ties on a chip could form the backbone of a 
robust photonic system, suitable not only for 
high-capacity data transmission, but also for 
quantum-communications and quantum-memory
applications. ■
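
As a rough numerical picture of the 'wrong-direction' bending discussed in this article, Snell's law, n1 sin θ1 = n2 sin θ2, gives a refracted angle of opposite sign when the second medium is assigned a negative effective index. The sketch below is an illustration only: it assumes a single isotropic effective index, whereas a real hyperbolic metasurface is strongly anisotropic, and the function name and index values are hypothetical rather than taken from High and colleagues' devices.

```python
import math


def refraction_angle_deg(theta_in_deg, n_in, n_out):
    """Refracted angle (degrees) from Snell's law: n_in*sin(t_in) = n_out*sin(t_out).

    A negative n_out (an assumed effective index) produces a negative angle,
    meaning the beam bends to the same side of the normal as the incoming beam.
    Returns None when no propagating refracted beam exists.
    """
    s = n_in * math.sin(math.radians(theta_in_deg)) / n_out
    if abs(s) > 1.0:
        return None  # total internal reflection
    return math.degrees(math.asin(s))


# Hypothetical index values, chosen only to contrast the two regimes.
normal = refraction_angle_deg(30.0, 1.0, 1.5)     # ordinary refraction
negative = refraction_angle_deg(30.0, 1.0, -1.5)  # negative refraction
print(normal, negative)
```

In this toy model, a device whose effective index changes sign with wavelength would send two colours to opposite sides of the normal, which is the basis of the wavelength-based routing described above.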
The micronucleus
gets its big break 


Extensive chromosomal rearrangement — chromothripsis — is seen in several 
cancers. Imaging and sequencing of single cells shows that this phenomenon can 
occur inside cellular anomalies known as micronuclei. SEE ARTICLE P.179 


KRISTIN A. KNOUSE & ANGELIKA AMON 


It has long been assumed that cancer evolves
through the gradual accumulation of
genetic alterations, but recent analyses of
cancer genomes have challenged the universal-
ity of this hypothesis. At least 2% of all cancers,
and more than 10% of brain cancers, exhibit
chromothripsis — the extensive rearrange-
ment of one or a few chromosomes. In these
cases, it seems that a chromosome shatters into
pieces and is then stitched back together, with
some segments being reincorporated, albeit in
a random order and orientation, while others
are left behind. Modelling suggests that these
rearrangements arise from a single catastrophic
event, rather than through several independent
or consecutive chromosomal rearrangements.
On page 179 of this issue, Zhang et al. show
that chromothripsis can occur when faulty
cell division results in the formation of struc-
tures called micronuclei that contain isolated
chromosomes.

Perhaps the most peculiar aspect of chromo-
thripsis is that the rearrangements are largely
confined to a single chromosome. This suggests
that the causative event occurs during cell divi-
sion (mitosis), when chromosomes are spatially
distinct, rather than during the remainder of the
cell cycle (interphase), when the chromosomes
are closely juxtaposed. Insight into the cause
of chromothripsis has come from studies of
chromosome mis-segregation. Occasionally,
individual chromosomes fail to attach properly
to the mitotic spindle, a structure that forms
during cell division to segregate chromosomes.
Improperly attached chromosomes lag in the
spindle midzone when the other chromosomes
are pulled to opposite spindle poles, and these
lagging chromosomes either segregate to the
proper daughter cell or end up in the incorrect
daughter cell. Either way, their delayed segrega-
tion means that they often fail to incorporate
into the main nucleus of the daughter cells,
but instead form a smaller satellite nucleus, or
micronucleus (Fig. 1).

Micronuclei are not a safe place for chromo-
somes: their membranes are prone to rupture,
which exposes the DNA to the cytoplasm, and
they have reduced import of DNA replication
and repair factors, which can lead to DNA
damage when DNA is replicated during the
S phase of the cell cycle. Thus, micronuclei
provide an environment ripe for chromothrip- 
sis, but there has been no direct evidence that 
it actually occurs in this context. 

Zhang et al. demonstrate a direct association
between micronuclei and chromothripsis
by combining live-cell imaging and whole- 
genome sequencing (a combined technique 
referred to as ‘LookSeq’). The authors induced
micronucleus formation by transiently treating 
cells with nocodazole, a chemical agent that 
destabilizes the mitotic spindle and increases 
the frequency of improper chromosome 
attachment, lagging chromosomes and micro- 
nuclei. Through live imaging, the authors iden- 
tified micronucleated cells that underwent 
micronuclear rupture during S phase and that 
subsequently divided to produce two daughter
cells. The authors then isolated and sequenced
each daughter cell (Fig. 1). 

But how could the researchers determine 
whether being in a micronucleus affects chro- 
mosomal structure? They show that DNA 
in micronuclei is poorly replicated and thus 
assumed that the micronuclear chromosome 
would not be equally distributed to the two 
daughter cells after cell division. This process 
would generate an asymmetry in the number of 
chromosomes (copy number) and their paren- 
tal origin (haplotype) between the two daughter 
cells, enabling identification of the micronu- 
clear chromosome. For example, imagine that 
nocodazole treatment caused the maternal 
copy of chromosome 2 to segregate properly 
but form a micronucleus. This micronucleated 
cell would then replicate all the chromosomes 
but under-replicate maternal chromosome 2 
in the micronucleus. After cell division, one 
daughter cell would have two copies of chro- 
mosome 2 (the normal paternal haplotype and 
the micronuclear maternal haplotype), and the 
other daughter cell would have only one copy 
of chromosome 2 (the paternal haplotype). A 
copy-number and haplotype asymmetry would 
also exist if the lagging chromosome segregated 
improperly. 
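
The chromosome 2 example above can be sketched in a few lines of code: given the copy number of each haplotype in the two daughter cells, any chromosome haplotype whose copy number differs between the daughters is a candidate micronuclear chromosome. This is only a minimal illustration of the asymmetry logic, not the authors' analysis pipeline; the function name and the copy-number values are hypothetical.

```python
def asymmetric_haplotypes(daughter_a, daughter_b):
    """Return sorted (chromosome, haplotype) keys whose copy number differs.

    Each argument maps (chromosome, haplotype) -> copy number in one daughter
    cell; a missing key is treated as zero copies.
    """
    keys = set(daughter_a) | set(daughter_b)
    return sorted(k for k in keys if daughter_a.get(k, 0) != daughter_b.get(k, 0))


# Hypothetical copy numbers mirroring the example in the text: the
# under-replicated maternal chromosome 2 ends up in only one daughter.
daughter_a = {
    ("chr1", "maternal"): 1, ("chr1", "paternal"): 1,
    ("chr2", "maternal"): 1, ("chr2", "paternal"): 1,
}
daughter_b = {
    ("chr1", "maternal"): 1, ("chr1", "paternal"): 1,
    ("chr2", "maternal"): 0, ("chr2", "paternal"): 1,
}

candidates = asymmetric_haplotypes(daughter_a, daughter_b)
print(candidates)
```

In real single-cell sequencing data the copy numbers would be noisy estimates rather than exact integers, so a tolerance or statistical test would replace the strict inequality used here.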

Figure 1 | Following the fate of a micronucleus. Zhang et al. treated dividing cells with the chemical
nocodazole to generate lagging chromosomes that either segregate properly (a) or mis-segregate (b); in
both cases, extranuclear, chromosome-containing structures called micronuclei were formed in one of
the two daughter cells. The authors imaged these cells through another complete cell cycle and observed
the micronuclei undergoing membrane rupture and inaccurate DNA replication. The micronucleus
and the primary nucleus then fused and the cells divided; the chromosome that had been within the
micronucleus was partitioned to just one of the daughter cells. By sequencing both daughter cells, the
authors were able to identify the micronuclear chromosome on the basis of its copy-number asymmetry.
These chromosomes contained structural rearrangements that resemble the ‘chromosome shattering’
known as chromothripsis.

Zhang et al. found one or two chromo-
somes showing copy-number asymmetry in
all the daughter-cell pairs they sequenced.
They then used cells in which only one copy
of a chromosome remained to determine the
arrangement of sequence variations for each 
chromosome haplotype. This allowed them to 
use sequence data to calculate the copy number 
of each haplotype in each cell and thereby iden- 
tify the putative micronuclear chromosome. 
The authors’ sequence data further showed 
that micronuclear chromosomes had signifi- 
cantly more structural rearrangements than 
other chromosomes. Most of the rearrange- 
ments seemingly arose through a breakage and 
rejoining mechanism, which could be second- 
ary to faulty DNA replication in the micronu- 
cleus. The transient lifespans of micronuclei 
and their tendency to contain only one or two 
chromosomes thus elegantly account for the 
focal nature of chromothripsis. Presumably, 
once the cell proceeds into the next division 
and the micronucleus reincorporates into the 
main nucleus, the damaged chromosomes are 
exposed to appropriate levels of replication and
repair factors, and the rearrangements could 
become stabilized over subsequent generations. 
The discovery of mitotic defects as an origin 
of chromothripsis provides further evidence 
that chromosome mis-segregation and DNA 
rearrangements, both of which are observed 
in tumour cells, can be mechanistically linked.

Zhang et al. identified extensive rearrange- 
ments in nearly all micronuclear chromosomes, 
indicating that ruptured micronuclei lead to 
extensive mutation. Examining micronucle- 
ated cells before DNA replication, and cells 
in which micronuclei do not rupture, could 
reveal whether DNA replication is required 
for chromothripsis, and determine whether 
chromothripsis is but one of many possible 
outcomes for micronuclear chromosomes. 
LookSeq is a powerful method to address
these and many other questions, providing
knowledge on both the history of a cell and the
architecture of its genome. ■


Timing is everything
during deglaciations


Links between various climate records for the North Atlantic Ocean and the
Mediterranean Sea have helped to identify a potential mechanism that enhanced
sea-level rise during the last interglacial time interval. SEE LETTER P.197


KATHARINA BILLUPS


As everyone who enjoys a good murder
mystery knows, establishing the
modus operandi of the villain is crucial
to solving the crime. Similarly, understanding
the forces that drive climate change requires an
unambiguous reconstruction of the sequence
of events involved. In this vein, Marino et al.
(page 197) have unravelled a chain of events
characterizing the melting of the large polar
ice sheets that existed about 140,000 years ago.
This deglaciation led to the last interglacial
interval — the last time that Earth underwent
an interval of peak warmth between glacial
periods, and at which the sea level was similar
to or perhaps slightly higher than it is today.
The findings reveal fundamental differences
between the two most recent glacial-to-
interglacial transitions.

Numerous publications have provided
a comprehensive picture of the timings of
events that make up the most recent glacial-
to-interglacial transition, which began 20,000
years ago. These events culminated in our
current interglacial epoch, the Holocene.
But such a detailed picture is more difficult
to assemble further back in time, given the
inherent difficulties in dating older geological
materials.

The problem is that the availability of
radiometrically dated materials needed to
determine an accurate sequence of events
decreases the further back in time one goes.
Other means of establishing ages must
therefore be used. But ages derived from,
for example, astrochronology (which allows 
sediments to be dated using timescales cali- 
brated by astronomical events), are more often 
than not further apart than the events of inter- 
est. This plagues all studies trying to resolve 
rapid climate changes occurring on timescales 
of about 10,000 years or less, which includes 
the timescale over which major deglaciations 
take place. 

Figure 1 | Glacial-to-interglacial transitions. a, Termination I (TI) was the most recent time during
which Earth passed from a glacial to an interglacial period. The Last Glacial Maximum represents the
period when ice sheets were at their maximum extent. This was followed by a cooling event (Heinrich
Stadial 1) and then the main phase of deglaciation, meltwater pulse 1A (MWP-1A). Deglaciation
was interrupted by a return to almost full glacial conditions — the Younger Dryas — before the final
deglaciation phase, MWP-1B. Ages for individual events in TI are taken from refs 4 and 9. b, Marino
et al. report the sequence of events for Termination II, the penultimate glacial-to-interglacial transition.
In contrast to TI, the cooling event (Heinrich Stadial 11) coincided with the main deglaciation phase
(MWP-2B). MWP-2A represents a relatively minor meltwater pulse earlier in the transition.

In their search for the chain of climatic
events leading up to the last interglacial inter-
val, Marino and colleagues have provided
a solution to the age-model problem. They
adjusted individual climate proxy records for
the ocean (oxygen-isotope records from the
fossilized remains of unicellular marine organ-
isms called foraminifera) to oxygen-isotope
records derived from speleothems — inorganic
carbonate deposits in caves — which have
radiometrically constrained chronologies for 
this interval of time. 
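
One simple way to picture how a radiometric chronology can be transferred from a dated record to an undated one is with tie points: features that appear in both records are matched, and ages at intermediate depths in the marine core are obtained by linear interpolation. The sketch below illustrates only that generic idea; it is not Marino and colleagues' procedure, and the function name, depths and ages are invented for the example.

```python
from bisect import bisect_right


def depth_to_age(depth, tie_points):
    """Linearly interpolate an age for a core depth.

    tie_points is a list of (depth, age) pairs sorted by increasing depth,
    where each age is borrowed from the matching feature in a dated record.
    """
    depths = [d for d, _ in tie_points]
    if not depths[0] <= depth <= depths[-1]:
        raise ValueError("depth outside the range covered by the tie points")
    # Index of the tie point just above `depth`, clamped to the last interval.
    i = min(bisect_right(depths, depth), len(depths) - 1)
    (d0, a0), (d1, a1) = tie_points[i - 1], tie_points[i]
    return a0 + (a1 - a0) * (depth - d0) / (d1 - d0)


# Hypothetical tie points: (core depth in metres, age in thousands of years).
tie_points = [(10.0, 125.0), (12.0, 130.0), (15.0, 136.0)]
age = depth_to_age(13.5, tie_points)
print(age)
```

The quality of such an age model depends entirely on how well justified each tie point is, which is why the physical coupling between the records matters.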

This approach is not new. But the novelty 
of the current work lies in the fact that all 
records come from the Mediterranean region, 
and are thus naturally coupled through the 
local hydrological cycle and through oxygen- 
isotope fractionation within it, thereby pro- 
viding a basis for clear correlations. Once the 
oxygen-isotope records from different Medi- 
terranean sites are placed in a common tempo- 
ral framework, the relative timing of associated 
climate parameters emerges. Temporal rela- 
tionships can then be established between 
changes in sea surface temperatures, meltwater 
pulses and the deposition of ice-rafted debris 
onto the ocean floor. 

The revised chronology of the proxy records 
examined by Marino and co-workers points 
to a pivotal difference in climate dynamics 
between the most recent and the penultimate 
glacial-to-interglacial transitions, which are 
also known as Terminations I and II, respec- 
tively. Both are associated with times when 
summer insolation (incoming solar radiation) 
in the Northern Hemisphere and atmospheric 
carbon dioxide levels were increasing. One 
would therefore expect the ensuing ice-sheet 
melt-back behaviour to have been similar as 
well. But it was not. 

It is known that, during Termination I, 
there was a period of maximum cooling in 
the North Atlantic called Heinrich Stadial 1, 
which coincided with peak iceberg discharge 
(Fig. 1a). This was followed by the major phase 
of deglaciation, known as meltwater pulse 1A. 
Deglaciation was subsequently interrupted by 
a return to almost full glacial conditions — the
Younger Dryas — before the final ice retreat 
during meltwater pulse 1B. 

By contrast, Marino and colleagues’ 
chronology shows that the main phase of 
deglaciation (meltwater pulse 2B) during 
Termination II occurred during the period 
of maximum North Atlantic cooling and 
iceberg discharge (Heinrich Stadial 11). In 
short, Heinrich Stadial 1 preceded the major 
phase of ice-sheet retreat, whereas Heinrich 
Stadial 11 coincided with it (Fig. 1b). This 
means that not all terminations are equal, 
making it much more difficult to find an 
underlying forcing mechanism. 

North Atlantic cooling and enhanced ice- 
berg discharge during Heinrich Stadial 11 
might seem to be at odds with background 
climate warming and deglaciation. But as the 
authors point out, the new chronology also 
reveals that Heinrich Stadial 11 coincided 
with warming in the Southern Hemisphere, 
as recorded by ice cores. Warming in one
hemisphere coinciding with cooling in the
other is a well-characterized phenomenon
called the bipolar see-saw. The term refers to
ocean-surface heat transport from the South-
ern to the Northern Hemisphere as part of a
large-scale circulation process (the meridional
overturning circulation) in the Atlantic
Ocean. During Heinrich Stadial 11, relatively 
slow ocean circulation allowed heat to build 
up in the Southern Hemisphere. The authors 
suggest that this warming stimulated melting 
of the Antarctic ice sheet, contributing to the 
enhanced sea-level rise associated with the 
penultimate deglaciation. 

Marino and co-workers’ study exemplifies 
how the nature of the temporal ties between 
climate records can affect the reconstruction 
and understanding of climate events. The 
researchers provide a specific solution to the 
timing of events during Termination II. How- 
ever, the generality of their approach is limited 
by the assumptions that need to be made about 
climatological links between radiometrically 
dated speleothem and marine proxy records 
from dissimilar oceanographic regions. 

Rapid deglaciations such as Terminations I 
and II are part of an asymmetric climate pat- 
tern that begins with slow ice-sheet build-up 
followed by rapid ice-sheet melt-back. This 
sequence repeats on timescales of about 
100,000 years and first appears in the geologi- 
cal record of climate change about 900,000 
years ago. There are no obvious direct external
forcing mechanisms for this pattern, unlike the
shorter climate cycles that occur on timescales
of 41,000 to 19,000 years. The evolution of the
100,000-year climate cycle is therefore one of
the big unsolved mysteries in palaeoceano-
graphic research. A robust temporal reference
frame for the sequence of events defining each
deglaciation, such as that assembled by Marino
et al. for the penultimate one, should help to
build a consensus about the modus operandi
behind this climate pattern. ■
Ancient DNA steps into
the language debate 


Two studies of ancient human DNA reveal expansions of Bronze Age populations 
that shed light on the long-running debate about the origins and spread of 
Indo-European languages. SEE ARTICLE P.167 & LETTER P.207 


JOHN NOVEMBRE 


The archaeological adage that pots are not
people expresses the challenge of using
cultural artefacts to trace the movement
of populations. To surmount this obstacle, 
archaeologists and population geneticists are 
joining forces to extract DNA from human 
remains that are found with archaeological 
evidence of ancient cultures. In this issue, Haak 
et al. (page 207) and Allentoft et al. (page 167)
report two of the largest studies of ancient
DNA to date. Combined, the studies analyse
170 samples, and each group brings evidence to
bear on a long-standing controversy about the
origins of the Indo-European language family. 
Indo-European languages have been spoken 
across Europe and in central and southern 
Asia since the beginning of recorded history. 
This is a broad language family, including 
Italic, Germanic, Slavic, Hindi and Tocharian 
languages, among others. When and where the
precursor of these languages began to spread
has long been a subject of debate. There are
two main theories: the Anatolian and the 
steppe hypotheses. 

The Anatolian hypothesis posits that 
Proto-Indo-European spread with farming 
out of Anatolia (a region that lies within mod- 
ern-day Turkey) during the Neolithic period, 
approximately 7000 BC. Some archaeological
and genetic data support this hypothesis, as
does a phylogenetic analysis of linguistic data.
By contrast, the steppe hypothesis supposes
that Proto-Indo-European spread from the
Pontic-Caspian steppe (a region of modern-
day Russia, Ukraine and Kazakhstan that lies
north of the Black Sea and stretches eastwards
to the Caspian Sea; Fig. 1a). Recent versions of
the hypothesis argue that the language spread
during the late Copper Age and early Bronze
Age, between 3700 BC and 2000 BC, carried
by pastoralist horse-riders empowered by the
innovation of wheeled wagons — probably
the people associated with a culture known to
archaeologists as the Yamnaya.

Figure 1 | The spread of Indo-European languages. a, Indo-European
languages have been spoken across a broad area of Eurasia throughout recorded
history (countries in which these languages are spoken today are marked in
green). Two geographical origins for these languages have been proposed:
Anatolia and the Pontic-Caspian steppe. b, Haak et al. and Allentoft et al.
analysed ancient DNA taken from samples from across Europe and central
Asia. Their data point to human migration from a steppe culture, the Yamnaya.

Previous studies of both modern and ancient 
DNA have suggested an influx of people into
Europe from modern Eurasia after the spread 
of Neolithic farming. However, the details have 
been hazy, and any role for steppe populations 
has been unclear. To resolve this uncertainty, 
Haak et al. and Allentoft et al. obtained ancient 
human DNA samples from a broad swathe of 
archaeological cultures from Europe and central
Asia dating from around 6000 BC to 900 BC.
Although each study used different strategies 
(Allentoft et al. sequenced whole genomes, 
whereas Haak et al. targeted select regions), 
the groups successfully obtained around 
101 and 69 samples, respectively — a remark- 
able achievement. The sequencing data are 
poor by the standards applied to modern DNA, 
but they are sufficient to discern broad brush- 
strokes of human migration. 

Both studies found a genetic affinity between 
samples from a central European culture 
known as Corded Ware, which existed from 
around 2500 BC, and samples from the earlier
Yamnaya steppe culture. This similarity 
between distant populations is best explained 
by a substantial westward expansion of the 
Yamnaya or their close relatives into central 
Europe (Fig. 1b). Such an expansion is consist- 
ent with the steppe hypothesis, which argues 
that Corded Ware cultures were a conduit for 
the dispersal of Indo-European languages into 
Europe. The results also help to explain a mys- 
terious ancestry found in both Europe and the 
Americas, and in ancient DNA from a boy who
lived 24,000 years ago in eastern Siberia. Both
groups of researchers suggest that this ances- 
try entered Europe through the expansion of 
Yamnaya-related peoples, who are descended 
from the north Eurasian populations that con- 
tributed to the peopling of the Americas. 

The data also suggest that steppe populations
expanded eastwards. Allentoft et al.
found that the Afanasievo culture from central
Asia shows genetic affinity with the Yamnaya. 
They also found evidence to support theories 
of a back-migration from Corded Ware-related
populations that contributed to the
origins of the Sintashta culture in the Urals 
and their descendants, the Andronovo. This 
is particularly interesting because the steppe 
hypothesis supposes that an eastward migra- 
tion of steppe-descendant populations helped 
to give rise to Tocharian, a branch of Indo- 
European once spoken in western China. 

Together, these studies argue that Bronze 
Age population movements were important in 
shaping the genetics of Eurasia. Ancient DNA 
cannot prove how language spread, of course, 
and more data will help to refine our under- 
standing, but expansions of Yamnaya-related 
peoples add weight to the steppe hypothesis. 
If genes were moving en masse, it is likely that 
words were too. 

It remains to be seen whether ancient DNA 
samples will also support the hypothesis that 
the Indo-Iranian branch of Indo-European can 
be traced to a southward migration from the 
steppe. Research on modern DNA has posited
the existence of an ancient north Indian
population — can this be linked directly to 
the steppe populations? The size of the ances- 
tral steppe populations and the rate at which 
they expanded also remain to be determined. 
Finally, the migratory models put forward by 
the two groups differ in terms of how Near 
Eastern populations such as Armenians relate 
to the steppe. Haak and colleagues’ model 
implies that Near Eastern populations con- 
tribute ancestry to the Yamnaya, but aspects of 
Allentoft and co-workers’ data do not support 
such admixture. Future studies should resolve 
these questions. 

Nonetheless, these two studies represent a 
milestone in a 200-year-old debate. Tragically, 
this debate figured heavily in the racist politics 
and science of the nineteenth and early twentieth
centuries in Europe. The current research
takes place in a more open and humane
framework, and this should be encouraged and
protected. It will be exciting to see what the
careful use of ancient DNA will reveal about
the history of other language families and their
speakers.

Figure 1 (caption, continued) | Haak et al. and Allentoft et al. conclude that the Corded Ware culture of central Europe had ancestry
from the Yamnaya. Allentoft et al. also show that the Afanasievo culture to the
east is related to the Yamnaya, and that the Sintashta and Andronovo cultures
had ancestry from the Corded Ware. Arrows indicate migrations — those
from the Corded Ware reflect the evidence that people of this archaeological
culture (or their relatives) were responsible for the spreading of Indo-European
languages. All coloured boundaries are approximate.

The studies also foreshadow how ancient DNA 
will empower research on adaptive evolution. 
For instance, Allentoft et al. observed that the 
spread of a mutation that allows humans to 
drink milk into adulthood began only in the 
Bronze Age, later than previously supposed, 
a finding replicated by Haak and colleagues 
in another paper. 

One final point is that the genomes of 
living people vary more or less continuously 
with geography in most regions of the globe. 
But these two studies suggest that the relatively 
continuous patterns seen in modern DNA can 
be teased apart and understood in terms of a 
complex history of expansions and mixtures 
of ancient populations. Clearly, intersecting 
ancient DNA with archaeological and linguis- 
tic research promises to yield great progress in 
the study of prehistory. 
Population genomics of Bronze Age 
Eurasia 


Morten E. Allentoft, Martin Sikora, Karl-Göran Sjögren, Simon Rasmussen, Morten Rasmussen, Jesper Stenderup, 
Peter B. Damgaard, Hannes Schroeder, Torbjörn Ahlström, Lasse Vinner, Anna-Sapfo Malaspinas, Ashot Margaryan, 
Tom Higham, David Chivall, Niels Lynnerup, Lise Harvig, Justyna Baron, Philippe Della Casa, Paweł Dąbrowski, 
Paul R. Duffy, Alexander V. Ebel, Andrey Epimakhov, Karin Frei, Mirosław Furmanek, Tomasz Gralak, Andrey Gromov, 
Stanisław Gronkiewicz, Gisela Grupe, Tamás Hajdu, Radosław Jarysz, Valeri Khartanovich, Alexandr Khokhlov, 
Viktória Kiss, Jan Kolář, Aivar Kriiska, Irena Lasak, Cristina Longhi, George McGlynn, Algimantas Merkevicius, 
Inga Merkyte, Mait Metspalu, Ruzan Mkrtchyan, Vyacheslav Moiseyev, László Paja, György Pálfi, Dalia Pokutta, 
Łukasz Pospieszny, T. Douglas Price, Lehti Saag, Mikhail Sablin, Natalia Shishlina, Václav Smrčka, Vasilii I. Soenov, 
Vajk Szeverényi, Gusztáv Tóth, Synaru V. Trifanova, Liivi Varul, Magdolna Vicze, Levon Yepiskoposyan, 
Vladislav Zhitenev, Ludovic Orlando, Thomas Sicheritz-Pontén, Søren Brunak, Rasmus Nielsen, Kristian Kristiansen 
& Eske Willerslev 


The Bronze Age of Eurasia (around 3000-1000 BC) was a period of major cultural changes. However, there is debate about 
whether these changes resulted from the circulation of ideas or from human migrations, potentially also facilitating the 
spread of languages and certain phenotypic traits. We investigated this by using new, improved methods to sequence 
low-coverage genomes from 101 ancient humans from across Eurasia. We show that the Bronze Age was a highly 
dynamic period involving large-scale population migrations and replacements, responsible for shaping major parts of 
present-day demographic structure in both Europe and Asia. Our findings are consistent with the hypothesized spread 
of Indo-European languages during the Early Bronze Age. We also demonstrate that light skin pigmentation in Europeans 
was already present at high frequency in the Bronze Age, but not lactose tolerance, indicating a more recent onset of 
positive selection on lactose tolerance than previously thought. 


The processes that created the genetic landscape of contemporary 
human populations of Europe and Asia remain contentious. Recent 
studies have revealed that western Eurasians and East Asians diverged 
outside Africa between 45 and 36.2 thousand years before present (45 
and 36.2 kyr BP) and that East Asians, but not Europeans, received 
subsequent gene flow from remnants of an earlier migration into Asia 
of Aboriginal Australian ancestors at some point before 20 kyr BP. 
There is evidence that the western Eurasian branch constituted a 
meta-population stretching from Europe to Central Asia and that 
it contributed genes to both modern-day western Eurasians and early 
indigenous Americans. The early Europeans received gene flow 
from the Middle East during the Neolithisation (transition from 
hunting-gathering to farming) around 8-5 kyr BP and possibly also 
from northern Asia. However, what happened hereafter, during 
the Bronze Age, is much less clear. 

The archaeological record testifies to major cultural changes in 
Europe and Asia after the Neolithic period. By 3000 BC, the 
Neolithic farming cultures in temperate Eastern Europe appear to 
be largely replaced by the Early Bronze Age Yamnaya culture, which 
is associated with a completely new perception of family, property and 
personhood, rapidly stretching from Hungary to the Urals. By 
2800 BC a new social and economic formation, variously named 
Corded Ware, Single Grave or Battle Axe cultures, developed in 
temperate Europe, possibly deriving from the Yamnaya background, and 
culturally replacing the remaining Neolithic farmers (Fig. 1). In 
western and Central Asia, hunter-gatherers still dominated in the Early 
Bronze Age, except in the Altai Mountains and Minusinsk Basin 
where the Afanasievo culture existed with a close cultural affinity to 
Yamnaya (Fig. 1). From the beginning of 2000 BC, a new class of 
master artisans known as the Sintashta culture emerged in the Urals, 
building chariots, breeding and training horses (Fig. 1), and 
producing sophisticated new weapons. These innovations quickly 
spread across Europe and into Asia where they appeared to give rise 
to the Andronovo culture (Fig. 1). In the Late Bronze Age around 
1500 BC, the Andronovo culture was gradually replaced by the 
Mezhovskaya, Karasuk, and Koryakova cultures. It remains debated 
whether these major cultural shifts during the Bronze Age in Europe and 
Asia resulted from the migration of people or through cultural 
diffusion among settled groups, and whether the spread of the 
Indo-European languages was linked to these events or predates them. 


Archaeological samples and DNA retrieval 


Genomes obtained from ancient biological remains can provide 
information on past population histories that is not retrievable from 
contemporary individuals. However, ancient genomic studies have 
so far been restricted to single or a few individuals because the 
degraded nature of ancient DNA makes sequencing costly and time 
consuming. To overcome this, we increased the average output of 
authentic endogenous DNA fourfold by: (1) targeting the outer 
cementum layer in teeth rather than the inner dentine layer, 
(2) adding a ‘pre-digestion’ step to remove surface contaminants 
and (3) developing a new binding buffer for ancient DNA extraction 
(Supplementary Information, section 3). This allowed us to obtain 
low-coverage genome sequences (0.01-7.4× average depth, overall 
average equal to 0.7×) of 101 Eurasian individuals spanning the entire 
Bronze Age, including some Late Neolithic and Iron Age individuals 
(Fig. 1, Supplementary Information, sections 1 and 2). Our data set 
includes 19 genomes with average depth between 1.1× and 7.4×, thereby 
doubling the number of existing Eurasian ancient genomes above 1× 
coverage (ref. 27). 




Bronze Age Europe 

By analysing our genomic data in relation to previously published 
ancient and modern data (Supplementary Information, section 6), 
we find evidence for a genetically structured Europe during the 
Bronze Age (Fig. 2; Extended Data Fig. 1; and Supplementary Figs 5 
and 6). Populations in northern and central Europe were composed of 
a mixture of the earlier hunter-gatherer and Neolithic farmer 
groups, but received ‘Caucasian’ genetic input at the onset of the 
Bronze Age (Fig. 2). This coincides with the archaeologically well- 
defined expansion of the Yamnaya culture from the Pontic-Caspian 
steppe into Europe (Figs 1 and 2). This admixture event resulted in the 
formation of peoples of the Corded Ware and related cultures, as 
supported by negative ‘admixture’ f3 statistics when using Yamnaya 
as a source population (Extended Data Table 2, Supplementary Table 
12). Although European Late Neolithic and Bronze Age cultures such 
Figure 1 | Distribution maps of ancient samples. Localities, cultural 
associations, and approximate timeline of 101 sampled ancient individuals 
from Europe and Central Asia (left). Distribution of Early Bronze Age cultures 
Yamnaya, Corded Ware, and Afanasievo with arrows showing the Yamnaya 
expansions (top right). Middle and Late Bronze Age cultures Sintashta, 
Andronovo, Okunevo, and Karasuk with the eastward migration indicated 
(bottom right). Black markers represent chariot burials (2000-1800 BC) with 
similar horse cheek pieces, as evidence of expanding cultures. Tocharian is the 
second-oldest branch of Indo-European languages, preserved in Western 
China. CA, Copper Age; MN, Middle Neolithic; LN, Late Neolithic; EBA, Early 
Bronze Age; MBA, Middle Bronze Age; LBA, Late Bronze Age; IA, Iron Age; 
BAC, Battle Axe culture; CWC, Corded Ware culture. 
from different periods projected onto contemporary individuals from Europe, 
West Asia, and Caucasus. Grey labels represent population codes showing 
coordinates for individuals (small) and population median (large). Coloured 
circles indicate ancient individuals. b, ADMIXTURE ancestry components 


as Corded Ware, Bell Beakers, Unetice, and the Scandinavian cultures 
are genetically very similar to each other (Fig. 2), they still display a 
cline of genetic affinity with Yamnaya, with highest levels in Corded 
Ware, lowest in Hungary, and central European Bell Beakers being 
intermediate (Fig. 2b and Extended Data Table 1). Using D-statistics, 
we find that Corded Ware and Yamnaya individuals form a clade to 
the exclusion of Bronze Age Armenians (Extended Data Table 1) 
showing that the genetic ‘Caucasus component’ present in Bronze 
Age Europe has a steppe origin rather than a southern Caucasus 
origin. Earlier studies have shown that southern Europeans received 
substantial gene flow from Neolithic farmers during the Neolithic. 
Despite being slightly later, we find that the Copper Age Remedello 
culture in Italy does not have the ‘Caucasian’ genetic component and 
is still clustering genetically with Neolithic farmers (Fig. 2; Extended 
Data Fig. 1 and Supplementary Fig. 6). Hence this region was either 
unaffected by the Yamnaya expansion or the Remedello pre-dates 
such an expansion into southern Europe. The ‘Caucasian’ component 
is clearly present during Late Bronze Age in Montenegro (Fig. 2b). 
The close affinity we observe between peoples of Corded Ware and 
Sintashta cultures (Extended Data Fig. 2a) suggests similar genetic 
sources of the two, which contrasts with previous hypotheses placing 
the origin of Sintashta in Asia or the Middle East. Although we 
cannot formally test whether the Sintashta derives directly from an 
eastward migration of Corded Ware peoples or if they share common 
ancestry with an earlier steppe population, the presence of European 
Neolithic farmer ancestry in both the Corded Ware and the Sintashta, 
combined with the absence of Neolithic farmer ancestry in the earlier 
Yamnaya, would suggest the former being more probable (Fig. 2b and 
Extended Data Table 1). 


width of the bars representing ancient individuals is increased to aid 
visualization. Individuals with less than 20,000 SNPs have lighter colours. 
Coloured circles indicate corresponding group in the PCA. Probable 
Yamnaya-related admixture is indicated by the dashed arrow. 


Bronze Age Asia 

We find that the Bronze Age in Asia is equally dynamic and char- 
acterized by large-scale migrations and population replacements. The 
Early Bronze Age Afanasievo culture in the Altai-Sayan region is 
genetically indistinguishable from Yamnaya, confirming an eastward 
expansion across the steppe (Figs 1 and 3b; Extended Data Fig. 2b and 
Extended Data Table 1), in addition to the westward expansion into 
Europe. Thus, the Yamnaya migrations resulted in gene flow across 
vast distances, essentially connecting Altai in Siberia with Scandinavia 
in the Early Bronze Age (Fig. 1). The Andronovo culture, which arose 
in Central Asia during the later Bronze Age (Fig. 1), is genetically 
closely related to the Sintashta peoples (Extended Data Fig. 2c), and 
clearly distinct from both Yamnaya and Afanasievo (Fig. 3b and 
Extended Data Table 1). Therefore, Andronovo represents a temporal 
and geographical extension of the Sintashta gene pool. Towards the 
end of the Bronze Age in Asia, Andronovo was replaced by the 
Karasuk, Mezhovskaya, and Iron Age cultures which appear multi- 
ethnic and show gradual admixture with East Asians (Fig. 3b and 
Extended Data Table 2), corresponding with anthropological and 
biological research. However, Iron Age individuals from Central 
Asia still show higher levels of West Eurasian ancestry than contem- 
porary populations from the same region (Fig. 3b). Intriguingly, indi- 
viduals of the Bronze Age Okunevo culture from the Sayano-Altai 
region (Fig. 1) are related to present-day Native Americans (Extended 
Data Fig. 2d), which confirms previous craniometric studies. This 
finding implies that Okunevo could represent a remnant population 
related to the Upper Palaeolithic Mal’ta hunter-gatherer population 
from Lake Baikal that contributed genetic material to Native 
Americans. 
Figure 3 | Genetic structure of Bronze Age Asia. a, Principal component 
analysis (PCA) of ancient individuals (n = 40) from different periods projected 
onto contemporary non-Africans. Grey labels represent population codes 
showing coordinates for individuals (small) and population median (large). 
Coloured circles indicate ancient individuals. b, ADMIXTURE ancestry 
components (K = 16) for ancient (n = 40) and selected contemporary 


Spread of the Indo-European languages 

Historical linguists have argued that the spread of the Indo-European 
languages must have required migration combined with social or 
demographic dominance, and this expansion has been supported by 
archaeologists pointing to striking similarities in the archaeological 
record across western Eurasia during the third millennium BC. 
Our genomic evidence for the spread of Yamnaya people from the 
Pontic-Caspian steppe to both northern Europe and Central Asia 
during the Early Bronze Age (Fig. 1) corresponds well with the 
hypothesized expansion of the Indo-European languages. In contrast 
to recent genetic findings, however, we only find weak evidence for 
admixture in Yamnaya, and only when using Bronze Age Armenians 
and the Upper Palaeolithic Mal’ta as potential source populations 
(Z = −2.39; Supplementary Table 12). This could be due to the 
absence of eastern hunter-gatherers as a potential source population 
for admixture in our data set. Modern Europeans show some genetic 
links to Mal’ta, which has been suggested to form a third European 
ancestral component (Ancestral North Eurasians (ANE)). Rather 
than a hypothetical ancient northern Eurasian group, our results 
reveal that ANE ancestry in Europe probably derives from the spread 
of the Yamnaya culture that distantly shares ancestry with Mal’ta 
(Figs 2b and 3b and Extended Data Fig. 3). 


Formation of Eurasian genetic structure 


It is clear from our autosomal, mitochondrial DNA and Y chro- 
mosome data (Extended Data Fig. 6) that the European and Central 
individuals. The width of the bars representing ancient individuals is increased 
to aid visualization. Individuals with less than 20,000 SNPs have lighter colours. 
Coloured circles indicate corresponding group in the PCA. Shared ancestry of 
Mal’ta with Yamnaya (green component) and Okunevo (grey component) is 
indicated by dashed arrows. 


Asian gene pools towards the end of the Bronze Age mirror 
present-day Eurasian genetic structure to an extent not seen in 
the previous periods (Figs 2 and 3; Extended Data Fig. 1 and 
Supplementary Fig. 6). Our results imply that much of the basis 
of the Eurasian genetic landscape of today was formed by the 
complex patterns of expansions, admixture and replacements 
during this period. We find that many contemporary Eurasians show 
lower genetic differentiation (FST) with local Bronze Age groups 
than with earlier Mesolithic and Neolithic groups (Extended Data 
Figs 4 and 5). Notable exceptions are contemporary populations 
from southern Europe such as Sardinians and Sicilians, which 
show the lowest FST with Neolithic farmers. In general, the levels 
of differentiation between ancient groups from different temporal 
and cultural contexts are greater than those between 
contemporary Europeans. For example, we find pairwise FST = 0.08 between 
Mesolithic hunter-gatherers and Bronze Age individuals from 
Corded Ware, which is nearly as high as FST between 
contemporary East Asians and Europeans (Extended Data Fig. 5). These 
results are indicative of significant temporal shifts in the gene 
pools and also reveal that the ancient groups of Eurasia were 
genetically more structured than contemporary populations. 
The diverged ancestral genomic components must then have 
diffused further after the Bronze Age through population growth, 
combined with continuing gene flow between populations, to 
generate the low differentiation observed in contemporary west 
Eurasians. 
Figure 4 | Allele frequencies for putatively positively selected SNPs. 

a, Coloured circles indicate the observed frequency of the respective SNP in 
ancient and modern groups (1000 Genomes panel). The size of the circle is 
proportional to the number of samples for each SNP and population. b, Allele 
frequency of rs4988235 in the LCT (lactase) gene inferred from imputation of 
ancient individuals. Numbers indicate the total number of chromosomes for 
each group. BA, Bronze Age; IA, Iron Age. 


Temporal dynamics of selected SNPs 

The size of our data set allows us to investigate the temporal dynamics 
of 104 genetic variants associated with important phenotypic traits or 
putatively undergoing positive selection (Supplementary Table 13). 
Focusing on four well-studied polymorphisms, we find that two single 
nucleotide polymorphisms (SNPs) associated with light skin 
pigmentation in Europeans exhibit a rapid increase in allele frequency 
(Fig. 4). For rs1426654, the frequency of the derived allele increases 
from very low to fixation within a period of approximately 3,000 years 
between the Mesolithic and Bronze Age in Europe. For rs12913832, a 
major determinant of blue versus brown eyes in humans, our results 
indicate the presence of blue eyes already in Mesolithic 
hunter-gatherers as previously described. We find it at intermediate frequency in 
Bronze Age Europeans, but it is notably absent from the 
Pontic-Caspian steppe populations, suggesting a high prevalence of brown 
eyes in these individuals (Fig. 4). The results for rs4988235, which is 
associated with lactose tolerance, were surprising. Although tolerance 
is high in present-day northern Europeans, we find it at most at low 
frequency in the Bronze Age (10% in Bronze Age Europeans; Fig. 4), 
indicating a more recent onset of positive selection than previously 
estimated. To further investigate its distribution, we imputed all 
SNPs in a 2 megabase (Mb) region around rs4988235 in all ancient 
individuals using the 1000 Genomes phase 3 data set as a reference 
panel, as previously described. Our results confirm a low frequency 
of rs4988235 in Europeans, with a derived allele frequency of 5% in 
the combined Bronze Age Europeans (genotype probability > 0.85) 
(Fig. 4b). Among Bronze Age Europeans, the highest tolerance 
frequency was found in Corded Ware and the closely related 
Scandinavian Bronze Age cultures (Extended Data Fig. 7). 
Interestingly, the Bronze Age steppe cultures showed the highest 
derived allele frequency among ancient groups, in particular the 
Yamnaya (Extended Data Fig. 7), indicating a possible steppe origin 
of lactose tolerance. 


Implications 


It has been debated for decades whether the major cultural changes that 
occurred during the Bronze Age resulted from the circulation of 
people or ideas and whether the expansion of Indo-European languages 
was concomitant with these shifts or occurred with the earlier spread 
of agriculture. Our findings show that these transformations 
involved migrations, but of a different nature than previously 
suggested: the Yamnaya/Afanasievo movement was directional into 
Central Asia and the Altai-Sayan region and probably without much 
local infiltration, whereas the resulting Corded Ware culture in 
Europe was the result of admixture with the local Neolithic people. 
The enigmatic Sintashta culture near the Urals bears genetic 
resemblance to Corded Ware and was therefore likely to be an eastward 
migration into Asia. As this culture spread towards Altai it evolved 
into the Andronovo culture (Fig. 1), which was then gradually 
admixed and replaced by East Asian peoples that appear in the later 
cultures (Mezhovskaya and Karasuk). Our analyses support 
migrations during the Early Bronze Age as a probable scenario for 
the spread of Indo-European languages, in line with reconstructions 
based on some archaeological and historical linguistic data. In the 
light of our results, the existence of the Afanasievo culture near Altai 
around 3000 BC could also provide an explanation for the mysterious 
presence of one of the oldest Indo-European languages, Tocharian, in 
the Tarim basin in China. It seems plausible that Afanasievo, with 
their genetic western (Yamnaya) origin, spoke an Indo-European 
language and could have introduced this southward to Xinjiang and 
Tarim. Importantly, however, although our results support a 
correspondence between cultural changes, migrations, and linguistic 
patterns, we caution that such relationships cannot always be expected 
but must be demonstrated case by case. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS 


DNA extraction and library preparation. A total of 603 human Bronze Age 
samples from across Eurasia were selected for initial molecular ‘screening’ to 
assess DNA preservation and hence the potential for genome-scale analyses. 
The samples consisted almost exclusively of teeth, but also a few bone and hair 
samples were included. All the molecular work (pre-library amplification) was 
conducted in dedicated ancient DNA clean laboratory facilities at the Centre for 
GeoGenetics, Natural History Museum, University of Copenhagen. 

Preferentially targeting the outer cementum layer in teeth rather than the 
dentine allowed us to maximize access to endogenous DNA (Supplementary 
Information, section 3). The amount of starting material varied, but was 
generally 100-600 mg. We also added a ‘pre-digestion’ step to the extraction 
protocol, where the drilled bone or tooth powder is incubated in an EDTA-based 
buffer before complete digestion to facilitate the removal of surface 
contaminants (Supplementary Information, section 3). Additionally, we developed 
a new DNA binding buffer for extraction that proved more efficient in 
recovering short DNA fragments compared to previous protocols (Supplementary 
Information, section 3). DNA libraries for sequencing were prepared using 
NEBNext DNA Sample Prep Master Mix Set 2 (E6070) and Illumina-specific 
adapters following established protocols. The libraries were ‘shot-gun’ 
sequenced in pools using Illumina HiSeq2500 platforms and 100-bp single-read 
chemistry (Supplementary Information, section 3). 

Molecular screening. For the molecular screening phase we generally generated 
between 5 and 20 million reads per library and these were used to evaluate the 
state of molecular preservation. Candidate samples were selected for further 
sequencing if they displayed a >10% C→T misincorporation damage signal in 
the 5′ ends as an indication of authentic ancient DNA, and a human DNA 
content >0.5% (Supplementary Information, section 3). 
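
The two screening thresholds quoted above can be written as a simple filter. The following Python sketch is illustrative only: the `ScreeningResult` record, its field names and the example values are hypothetical and are not taken from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class ScreeningResult:
    """Hypothetical per-library screening summary (field names are illustrative)."""
    sample_id: str
    ct_damage_5p: float        # fraction of C->T misincorporations at the 5' end
    human_dna_fraction: float  # fraction of reads assigned to the human genome


def passes_screening(result, min_damage=0.10, min_human=0.005):
    """Return True if the library exceeds both thresholds quoted in the text."""
    return (result.ct_damage_5p > min_damage
            and result.human_dna_fraction > min_human)


# Made-up example values, chosen only to exercise each branch of the filter.
libraries = [
    ScreeningResult("A", 0.25, 0.020),  # damaged and enough human DNA: keep
    ScreeningResult("B", 0.04, 0.300),  # little damage: may be modern DNA
    ScreeningResult("C", 0.30, 0.002),  # damaged, but too little human DNA
]
selected = [lib.sample_id for lib in libraries if passes_screening(lib)]
print(selected)
```

Only library "A" passes, because it is the only one that exceeds both the damage and the human DNA content thresholds.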

Genomic capture. We selected 24 samples with relatively low human DNA 
content (0.5-1.1%) for a whole-genome capture experiment to enrich for the 
low human DNA fraction in these samples. The capture was performed using 
the MYbait Human Whole Genome Capture Kit (MYcroarray, Ann Arbor, MI), 
following the manufacturer’s instructions (http://www.mycroarray.com/pdf/ 
MyYbaits-manual.pdf). After amplification, the libraries were purified using 
Agencourt AMPure XP beads, quantified using an Agilent 2100 bioanalyzer, 
pooled in equimolar amounts, and sequenced on Illumina HiSeq 2500, as described 
above. Methods and results are found in Supplementary Information, section 3. 

Bioinformatics. The Illumina data were basecalled using Illumina software 
CASAVA 1.8.2 and sequences were de-multiplexed with a requirement of full 
match of the 6 nucleotide index that was used for library preparation. Adaptor 
sequences and leading/trailing stretches of Ns were trimmed from the reads and 
additionally bases with quality 2 or less were removed using 
AdapterRemoval-1.5.4. Trimmed reads of at least 30 bp were mapped to the human reference 
genome build 37 using bwa-0.6.2 (ref. 44) with the seed disabled to allow for 
higher sensitivity. Mapped reads were filtered for mapping quality 30 and sorted 
using Picard (http://picard.sourceforge.net) and SAMtools. Data were merged to 
library level and duplicates removed using Picard MarkDuplicates 
(http://picard.sourceforge.net) and hereafter merged to sample level. Sample level BAMs were 
re-aligned using GATK-2.2-3 and hereafter had the md-tag updated and extended 
BAQs calculated using SAMtools calmd. Read depth and coverage were 
determined using pysam (http://code.google.com/p/pysam/) and BEDtools. Statistics 
of the read data processing are shown in Supplementary Table 6. 
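
As a rough illustration of the read-trimming rules described above (removal of leading and trailing Ns and of bases with quality 2 or less, followed by a 30 bp minimum length), the sketch below applies them to a single read in plain Python. It is not AdapterRemoval: adapter detection is omitted, and the assumption that low-quality bases are removed only from the read ends is ours.

```python
def trim_read(seq, quals, min_qual=3, min_len=30):
    """Trim Ns and low-quality bases from both ends; drop short reads.

    seq   -- read sequence (string over A/C/G/T/N)
    quals -- list of integer Phred scores, one per base
    Bases with quality 2 or less (below min_qual) are trimmed from the ends.
    Returns (seq, quals), or None if fewer than min_len bases remain.
    """
    start, end = 0, len(seq)
    while start < end and (seq[start] == "N" or quals[start] < min_qual):
        start += 1
    while end > start and (seq[end - 1] == "N" or quals[end - 1] < min_qual):
        end -= 1
    if end - start < min_len:
        return None
    return seq[start:end], quals[start:end]


# A 43-base read with two leading Ns and one trailing N, all of quality 2.
trimmed = trim_read("NN" + "ACGT" * 10 + "N", [2, 2] + [30] * 40 + [2])
# A 20-base read is shorter than the 30 bp minimum and is discarded.
too_short = trim_read("ACGT" * 5, [30] * 20)
```

Here `trimmed` holds the 40 central bases with their qualities, and `too_short` is `None`.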

DNA authentication. DNA contamination can be problematic in samples 
from museum collections that may have been handled extensively. To secure 
authenticity, we used the Bayesian approach implemented in mapDamage 2.0 
(ref. 48) and recorded the following three key damage parameters for each 
sample: (1) the frequency of C→T transitions at the first position at the 5′ end of 
reads, (2) λ, the fraction of bases positioned in single-stranded overhangs, and 
(3) δs, the estimated C→T transition rate in the single-stranded overhangs 
(Supplementary Information, section 5). For further sequencing and 
downstream analyses we only considered individuals displaying at least 10% C→T 
damage transitions at position 1. MapDamage outputs are summarized in 
Supplementary Table 7. 

We also estimated the levels of mitochondrial DNA contamination. We used 
contamMix 1.0-10 (ref. 49), which generates a moment-based estimate of the error 
rate and a Bayesian-based estimate of the posterior probability of the 
contamination fraction. We conservatively removed individuals with indications of 
contamination >5% (Supplementary Information, section 5). For males with 
sufficient depth of coverage we also estimated contamination based on the X 
chromosome as implemented in ANGSD (Supplementary Information, 
section 5). Results are shown in Supplementary Table 8. After implementing 
the 0.5% cut-off for human DNA content, combined with these ancient DNA 
authentication criteria, our final sample consisted of 101 individuals 
(Supplementary Information, section 1). 

Data sets. We constructed two data sets for population genetic analysis by mer- 
ging ancient DNA data generated in this as well as previous studies with two 
reference panels of modern individual genotype data (Supplementary 
Information, section 6). For both data sets, genotypes for all ancient individuals 
were obtained at all variant positions in the reference panel, discarding variants 
where alleles for the ancient individuals did not match either of the alleles 
observed in the panel. Genotypes for low-coverage samples (including all data 
generated in this study) were obtained by randomly sampling a single read with 
both mapping and base quality ≥ 30. Genotypes for high-coverage samples were 
called using the ‘call’ command of bcftools (https://github.com/samtools/ 
bcftools) and filtering for quality score (QUAL) ≥ 30. Error rates and inclusion 
thresholds for low coverage samples were obtained by performing PCA and 
model-based clustering (described below) on subsampled data sets of higher 
coverage individuals. For population genetic analyses (D and f statistics, FST) 
we obtained sample allele frequencies for the ancient groups (Supplementary 
Table 9) at each SNP by counting the total number of alleles observed, treating 
the low coverage individuals as haploid. See Supplementary Information, section 
6 for more details. 
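
The random-read genotyping and haploid allele counting described in this paragraph can be sketched as follows. This is a minimal illustration under our own assumptions: reads at a site are represented as hypothetical `(base, mapping_quality, base_quality)` tuples, and the function names are not from the authors' code.

```python
import random


def sample_pseudo_haploid(reads, min_mapq=30, min_baseq=30, rng=random):
    """Pick one allele at a site from a randomly chosen read passing both filters.

    reads -- list of (base, mapping_quality, base_quality) tuples at the site
    Returns the sampled base, or None if no read passes the filters.
    """
    usable = [base for base, mapq, baseq in reads
              if mapq >= min_mapq and baseq >= min_baseq]
    return rng.choice(usable) if usable else None


def group_allele_frequency(alleles, ref, alt):
    """Frequency of `alt` in a group, counting one allele per haploid call.

    Missing calls and calls matching neither panel allele are discarded.
    Returns (frequency, number_of_alleles_counted), or (None, 0).
    """
    kept = [a for a in alleles if a in (ref, alt)]
    if not kept:
        return None, 0
    return kept.count(alt) / len(kept), len(kept)


rng = random.Random(1)
site_reads = [("A", 37, 35), ("G", 12, 40), ("A", 60, 10)]
call = sample_pseudo_haploid(site_reads, rng=rng)  # only the first read passes
freq, n = group_allele_frequency(["A", "G", "G", None, "T"], ref="A", alt="G")
```

In the example, `call` is "A" because the other two reads fail a quality filter, and the group frequency is 2/3 from three usable alleles: the missing call and the "T" that matches neither panel allele are dropped.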

PCA and model-based clustering. We performed principal component analysis 
with EIGENSOFT, projecting ancient individuals onto the components inferred 
from sets of modern individuals by using the ‘lsqproject’ option of smartpca. The 
data set was converted to all homozygous genotypes before the analysis, by 
randomly sampling an allele at each heterozygote genotype of modern and 
high-coverage ancient individuals. See Supplementary Information, section 6 
for more details. 

Model-based clustering analysis was carried out using the maximum-likelihood 
approach implemented in ADMIXTURE. We used an approach where 
we first infer the ancestral components using modern samples only, and then 
‘project’ the ancient samples onto the inferred components using the ancestral 
allele frequencies inferred by ADMIXTURE (the ‘P’ matrix). We ran 
ADMIXTURE on an LD-pruned data set of all 2,345 modern individuals in the 
Human Origins SNP array data set, assuming K = 2 to K = 20 ancestral compo- 
nents, selecting the best of 50 replicate runs for each value of K. See 
Supplementary Information, section 6 for more details. Genotypes where the 
ancient individuals showed the damage allele at C > T and G > A SNPs were 
excluded for each low coverage ancient individual. 
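
The exclusion of possible damage alleles in the last sentence can be sketched as a small masking function. This is our reading of the rule, not the authors' implementation: at a SNP whose two alleles are C and T, an observed T is treated as potentially damage-derived, and likewise an observed A at a G/A SNP.

```python
# Allele that post-mortem deamination could produce at each affected SNP type.
DAMAGE_ALLELE = {frozenset("CT"): "T", frozenset("GA"): "A"}


def mask_damage_genotype(snp_alleles, observed):
    """Return None if `observed` could be a deamination artefact, else `observed`.

    snp_alleles -- the two alleles of the SNP in the reference panel, e.g. ("C", "T")
    observed    -- the allele sampled in a low-coverage ancient individual
    """
    damage = DAMAGE_ALLELE.get(frozenset(snp_alleles))
    if damage is not None and observed == damage:
        return None
    return observed


masked = mask_damage_genotype(("C", "T"), "T")  # excluded: possible C->T damage
kept = mask_damage_genotype(("C", "T"), "C")    # kept: not the damage allele
```

Transversion SNPs such as A/C are never masked under this rule, whichever allele is observed.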

D- and f-statistics and population differentiation. We used the D- and f-statistic 
framework to investigate patterns of admixture and shared ancestry in our data 
set. All statistics were calculated from allele frequencies using the estimators 
described previously, with standard errors obtained from a block jackknife with 
5 Mb block size. We investigated population differentiation by estimating F_ST for 
all pairs of ancient and modern groups from allele frequencies using the 
sample-size-corrected moment estimator of Weir and Hill, restricting the analysis to 
SNPs where a minimum of two alleles were observed in each population of the pair. 
See Supplementary Information, section 6 for more details. 
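
As an illustration of how a D-statistic and its block-jackknife standard error can be obtained from allele frequencies, the sketch below uses the widely used frequency-based form of D and a delete-one-block jackknife over 5 Mb blocks. Two simplifications are assumed and should be read as such: blocks are weighted equally (published implementations usually weight blocks by SNP count), and the function name `d_statistic` and the tuple layout of each site are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence, Tuple

# (chromosome, position, p_A, p_B, p_C, p_D) for the test D(A, B)(C, D)
Site = Tuple[str, int, float, float, float, float]


def d_statistic(
    sites: Sequence[Site], block_size: int = 5_000_000
) -> Tuple[float, float, float]:
    """Return (D, standard error, Z) for D(A, B)(C, D)."""
    num: Dict[Tuple[str, int], float] = defaultdict(float)
    den: Dict[Tuple[str, int], float] = defaultdict(float)
    for chrom, pos, p_a, p_b, p_c, p_d in sites:
        block = (chrom, pos // block_size)
        num[block] += (p_a - p_b) * (p_c - p_d)
        den[block] += (p_a + p_b - 2 * p_a * p_b) * (p_c + p_d - 2 * p_c * p_d)

    blocks = list(num)
    n_blocks = len(blocks)
    total_num = sum(num.values())
    total_den = sum(den.values())
    if n_blocks < 2 or total_den <= 0.0:
        raise ValueError("need at least two blocks and informative sites")
    d_value = total_num / total_den

    # Recompute D with each block left out in turn.
    leave_one_out = []
    for block in blocks:
        rest_den = total_den - den[block]
        if rest_den <= 0.0:
            raise ValueError("a leave-one-out denominator is zero")
        leave_one_out.append((total_num - num[block]) / rest_den)

    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (value - mean_loo) ** 2 for value in leave_one_out
    )
    std_err = math.sqrt(variance)
    z_score = d_value / std_err if std_err > 0 else float("nan")
    return d_value, std_err, z_score
```

With this sign convention, a positive D means that B shares more alleles with D (or A with C) than expected under a symmetric tree, and a negative D means that B is closer to C.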

Phenotypes and positive selection. To investigate the temporal dynamics of 
SNPs associated with phenotypes or putatively under positive selection, we 
estimated allele frequencies for a catalogue of 104 SNPs in all ancient and modern 
groups in the 1000 Genomes data set. Genotypes for the LCT region were imputed 
from genotype likelihoods with the 1000 Genomes Phase 3 reference panel 
using BEAGLE. See Supplementary Information, section 6 for more details. 

Data reporting. No statistical methods were used to predetermine sample size. 
The experiments were not randomized. The investigators were not blinded to 
allocation during experiments and outcome assessment. 

Code availability. Source code with R functions used in the analysis for this 
study is available as an R package at GitHub: 
https://github.com/martinsikora/admixr. 
indicate modern populations, with lines corresponding to ± 1 standard error of 
the respective D-statistic from block jackknife. Text away from the diagonal line 
indicates an ancient group with relative increase in allele sharing with the 
respective modern populations. 
Extended Data Figure 6 | Distribution of uniparental lineages in Bronze Age Eurasians. a, b, Barplots showing the relative frequency of Y chromosome (a) and 
Extended Data Figure 7 | Derived allele frequencies for lactase persistence in modern and ancient groups. Derived allele frequency of rs4988235 in the LCT 
gene inferred from imputation of ancient individuals. Numbers indicate the total number of chromosomes for each group. 
Extended Data Table 1 | Selected D-test results from 1000 Genomes data set (panel B) 

Configuration                                           D       Z      Interpretation
D(Yoruba, Neolithic Central)(Remedello, Hungary)        -0.022  -6.2   Remedello is closer to Neolithic farmers than BA Hungarians
D(Yoruba, Neolithic Central)(Hungary, Corded Ware)      -0.014  -5.5   BA Hungarians are closer to Neolithic farmers than Corded Ware
D(Yoruba, Neolithic Central)(Bell Beaker, Corded Ware)  -0.009  -3.4   Bell Beaker is closer to Neolithic farmers than Corded Ware
D(Yoruba, Neolithic Central)(Corded Ware, Yamnaya)      -0.018  -7.0   Corded Ware is closer to Neolithic farmers than Yamnaya
D(Yoruba, Neolithic Central)(Sintashta, Yamnaya)        -0.014  -4.9   Sintashta is closer to Neolithic farmers than Yamnaya
D(Yoruba, Yamnaya)(Hungary, Bell Beaker)                 0.011   4.5   Bell Beaker is closer to Yamnaya than BA Hungarians
D(Yoruba, Yamnaya)(Hungary, Corded Ware)                 0.016   6.7   Corded Ware is closer to Yamnaya than BA Hungarians
D(Yoruba, Yamnaya)(Hungary, Sintashta)                   0.011   4.8   Sintashta is closer to Yamnaya than BA Hungarians
D(Yoruba, Armenia)(Yamnaya, Corded Ware)                 0.002   0.9
D(Yoruba, Corded Ware)(Yamnaya, Armenia)                -0.015  -5.8   Corded Ware and Yamnaya form a clade to the exclusion of BA Armenians
D(Yoruba, Yamnaya)(Armenia, Corded Ware)                 0.018   6.8
D(Yoruba, Yamnaya)(Afanasievo, Karasuk)                 -0.047  -17.7
D(Yoruba, Afanasievo)(Yamnaya, Karasuk)                 -0.038  -16.0  Yamnaya and Afanasievo form a clade to the exclusion of other ancient groups*
D(Yoruba, Karasuk)(Yamnaya, Afanasievo)                  0.008   3.6

*Results are shown for Karasuk as group X, which is the only ancient group with Z > 3 for D(Yoruba, X)(Yamnaya, Afanasievo) 
                                                          f3 outgroup*                                            f3 admixture†
Group                        Population with highest f3   f3     SE     Example population pair with f3 < 0      f3      Z
Ust-Ishim                    Surui                        0.240  0.003
Kostenki*                    Hunter-gatherer W            0.266  0.003
Afontova Gora                Mal'ta                       0.317  0.005
Mal'ta*                      Afontova Gora                0.317  0.005
Hunter-gatherer W            Hunter-gatherer Scandinavia  0.313  0.002
Hunter-gatherer Scandinavia  Hunter-gatherer W            0.313  0.002  (Hunter-gatherer W, Mal'ta)§              -0.005  -1.78
Neolithic Hungary            Neolithic Central            0.287  0.002  (Hunter-gatherer W, Neolithic Central)§   -0.002  -1.09
Neolithic Central            Neolithic Hungary            0.287  0.002
Neolithic Scandinavia*       Neolithic Central            0.285  0.003
Remedello                    Neolithic Scandinavia        0.284  0.003
BA Hungary                   Hunter-gatherer W            0.284  0.002  (Hunter-gatherer W, BA Armenia)           -0.011  -8.69
Bell Beaker                  Hunter-gatherer W            0.278  0.002  (Remedello, Yamnaya)                      -0.007  -4.31
Corded Ware                  Hunter-gatherer W            0.279  0.002  (Neolithic Hungary, Yamnaya)              -0.009  -6.16
Unetice                      Hunter-gatherer W            0.278  0.002  (Neolithic Central, Afanasievo)           -0.008  -6.07
BA Scandinavia               Hunter-gatherer W            0.282  0.002  (Hunter-gatherer W, BA Armenia)           -0.013  -9.61
BA Baltic                    Hunter-gatherer Scandinavia  0.276  0.008
BA Montenegro"               Scottish                     0.275  0.005
BA Armenia                   IA Armenia                   0.270  0.005  (Neolithic Hungary, Okunevo)              -0.006  -2.66
Yamnaya                      Afanasievo                   0.284  0.002  (Neolithic Central, Mal'ta)§              -0.004  -2.32
Afanasievo                   Yamnaya                      0.284  0.002
Stalingrad quarry+           BA Afontova Gora             0.287  0.007
Okunevo                      Karitiana                    0.282  0.003
Sintashta                    Andronovo                    0.278  0.002  (Neolithic Central, Okunevo)§             -0.004  -2.43
Andronovo                    Hunter-gatherer Scandinavia  0.280  0.002  (Remedello, Mal'ta)                       -0.005  -5.13
Mezhovskaya                  Andronovo                    0.277  0.002  (Sintashta, HanB)                         -0.012  -5.62
Karasuk                      BA Afontova Gora             0.276  0.003  (Andronovo, HanB)                         -0.022  -27.49
BA Afontova Gora             Nganasan                     0.282  0.003  (Remedello, Okunevo)§                     -0.002  -0.15
IA Altai                     BA Afontova Gora             0.274  0.004  (Afanasievo, HanB)                        -0.017  -10.88
IA Hungary?                  Lithuanian                   0.273  0.002
IA Scandinavia*              Hunter-gatherer Scandinavia  0.278  0.003
IA Armenia‘                  Bell Beaker                  0.275  0.004
IA Russia?                   Nganasan                     0.285  0.003

*Human Origins data set (panel A); †1000 Genomes data set (panel B); ‡group with single individual; §pair with lowest f3 reported for groups with negative f3 without significant Z-score after correcting for multiple 
hypothesis tests (−4.1 < min(Z) < 0; 1,260 tests per group); ||too few markers with data from more than one chromosome. 
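
The footnote does not state how the cutoff of −4.1 was derived. One reading, offered here purely as an assumption, is a two-sided Bonferroni correction at a family-wise level of 0.05 over the 1,260 tests per group. The short calculation below shows the Z threshold that this assumption implies; the function name `bonferroni_z_threshold` is hypothetical.

```python
from statistics import NormalDist


def bonferroni_z_threshold(
    n_tests: int, alpha: float = 0.05, two_sided: bool = True
) -> float:
    """Lower-tail Z cutoff for a Bonferroni-corrected significance level."""
    per_test_alpha = alpha / n_tests
    if two_sided:
        # Split the per-test level between the two tails.
        per_test_alpha /= 2.0
    return NormalDist().inv_cdf(per_test_alpha)


threshold = bonferroni_z_threshold(1260)  # two-sided, alpha = 0.05
```

Under the two-sided assumption the cutoff rounds to −4.1; a one-sided correction would give a slightly less stringent value, so the match with the footnote is suggestive rather than conclusive.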
Cloning and variation of ground state intestinal stem cells 

Xia Wang, Yusuke Yamamoto, Lane H. Wilson, Ting Zhang, Brooke E. Howitt, Melissa A. Farrow, Florian Kern, 
Gang Ning, Yue Hong, Chiea Chuen Khor, Benoit Chevalier, Denis Bertrand, Lingyan Wu, Niranjan Nagarajan, 
Francisco A. Sylvester, Jeffrey S. Hyams, Thomas Devers, Roderick Bronson, D. Borden Lacy, Khek Yu Ho, 
Christopher P. Crum, Frank McKeon & Wa Xian 


Stem cells of the gastrointestinal tract, pancreas, liver and other columnar epithelia collectively resist cloning in their 
elemental states. Here we demonstrate the cloning and propagation of highly clonogenic, ‘ground state’ stem cells of the 
human intestine and colon. We show that derived stem-cell pedigrees sustain limited copy number and sequence 
variation despite extensive serial passaging and display exquisitely precise, cell-autonomous commitment to 
epithelial differentiation consistent with their origins along the intestinal tract. This developmentally patterned and 
epigenetically maintained commitment of stem cells is likely to enforce the functional specificity of the adult intestinal 
tract. Using clonally derived colonic epithelia, we show that toxins A or B of the enteric pathogen Clostridium difficile 
recapitulate the salient features of pseudomembranous colitis. The stability of the epigenetic commitment programs of 
these stem cells, coupled with their unlimited replicative expansion and maintained clonogenicity, suggests certain 
advantages for their use in disease modelling and regenerative medicine. 


While dominating prospective strategies for regenerative medicine, 
embryonic stem cells and induced pluripotent stem cells (iPSCs) face 
formidable challenges including risk of teratoma, complex guiding 
protocols for lineage specificity, and limited regenerative capacity of 
the lineages ultimately produced. The success and promise of iPSCs 
have largely overshadowed efforts to harness stem cells intrinsic to 
regenerative tissues. Green and colleagues developed methods for 
cloning epidermal stem cells that form a stratified epithelium upon 
engraftment, and these methods have been successfully applied to 
corneal, thymic and airway epithelia. However, stem cells of 
columnar epithelial tissues resist cloning in a manner that maintains 
their immaturity during proliferative expansion, and instead must be 
carried forward as regenerative, differentiating ‘organoids’. 
Despite their obvious potential in regenerative medicine and constant 
improvement, the very low percentage of clonogenic cells in organoids 
limits the kinetics of their propagation as well as their utility for 
exploring the elemental stem cell. 

The present study reports the cloning and propagation of 
‘ground state’ human intestinal stem cells (ISC^{GS}). This technology 
offers insights into the molecular and functional features of columnar 
epithelial stem cells and their utility for disease modelling and 
regenerative medicine. 


Cloning human fetal intestinal stem cells 

We developed media (herein SCM-6F8) containing novel combinations 
of growth factors and regulators of TGF-β/BMP (transforming 
growth factor-β/bone morphogenetic protein), Wnt/β-catenin, EGF 
(epidermal growth factor), IGF (insulin-like growth factor) and Notch 
pathways that support the maintenance of human intestinal stem 
cells in a highly clonogenic, ground state form. Thus single-cell 
suspensions of intestinal epithelia derived from 20- to 21-week-old fetal 
demise cases yield colonies composed of highly immature cells in 
which differentiation markers can be induced by Notch suppression 
(Fig. 1a). Following induced differentiation via Wnt withdrawal, 
we were unable to recover ground state stem cells by our methods 
(Extended Data Fig. 1a-c). 

The clonogenicity of cells in the colonies was determined by single-cell 
transfer to be greater than 50% (Fig. 1b). This high clonogenicity 
permits the rapid generation of single-cell ‘pedigree’ lines for expansion 
and characterization of lineage fates upon differentiation 
(Fig. 1b). Pedigree lines of ISC^{GS} and tracheobronchial stem cells 
(TBSC^{GS}) grown for several months in culture were differentiated 
in air-liquid interface (ALI) cultures for 10-30 days (Fig. 1c). The 
ISC^{GS} formed a highly uniform, 3D serpentine pattern, whereas 
TBSC^{GS} produced a stratified epithelium with apically positioned 
ciliated and goblet cells. Histological sections of differentiated ISC^{GS} 
revealed a columnar epithelium of villus-like structures marked by 
goblet (Muc2+), endocrine (chromogranin A+), and Paneth cells and 
polarized villin expression (Fig. 1d; Extended Data Fig. 1d), indicating 
that the progeny of a single ISC^{GS} can give rise to all epithelial lineages 
typically found in the small intestine. Importantly, differentiation 
of these ground state stem cells is accomplished by exposure to 
an ALI rather than by removal of factors such as Wnt that maintain 
immaturity. 

While principal component analysis (PCA) of differentially 
expressed genes of ground state stem cells and ALI-differentiated 


1The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut 06032, USA. 2Department of Genetics and Developmental Biology, University of Connecticut Health Center, Farmington, 
Massachusetts 02118, USA. 5Department of Pathology, Microbiology, and Immunology, Vanderbilt University School of Medicine, Nashville, Tennessee 37232, USA. 6Department of Ophthalmology, Yong 
Loo Lin School of Medicine, National University of Singapore, 119228 Singapore. 7Department of Pediatrics, Division of Gastroenterology, The University of North Carolina at Chapel Hill, Chapel Hill, North 
Carolina 27599, USA. 8Division of Digestive Diseases, Hepatology, and Nutrition, Connecticut Children’s Medical Center, Hartford, Connecticut 06106, USA. 9Department of Medicine, University of 
Connecticut Health Center, Farmington, Connecticut 06032, USA. 10Department of Microbiology and Immunobiology, Harvard Medical School, Boston, Massachusetts 02115, USA. 11Department of 
Medicine, National University of Singapore, 119228 Singapore. 12Multiclonal Therapeutics, Inc., Farmington, Connecticut 06032, USA. 
Figure 1 | Cloning stem cells from fetal intestine. a, Left, Sox9 expression in 
fetal intestine, scale bar, 25 μm; colonies from intestine (n = 10 biological 
replicates); colonies of ISC pedigree (n = 30 independent experiments). Scale 
bar, 75 μm. Right, ISC colonies stained with indicated antibodies. n = 4 
technical replicates. Bottom, marker expression following Notch inhibition. 
n = 4 technical replicates. b, Left, ISC colony growth. Scale bar, 75 μm. 
Right, clonogenicity of colony cells. n = 3 biological replicates. c, ISC and TBSC 
pedigrees and ALI differentiation (tubulin, green; Muc5AC, red). Scale bar, 
50 μm left, 25 μm right top, 25 μm bottom right; n = 7 biological replicates; 
n = 3 technical replicates; 3 independent experiments. d, ALI-differentiated 
ISC. Scale bar, 50 μm. n = 7 biological replicates; n = 3 technical replicates; 
3 independent experiments. H&E, haematoxylin and eosin staining. e, PCA 
using 2,158 genes (> twofold, P < 0.05 by Student’s t-test) of ISC and TBSC 
and corresponding ALI-differentiated epithelia. f, Expression heat map of 
markers in ISC and TBSC. Scale, −2.5-fold (extreme blue) to +2.5-fold 
(extreme red). n = 3 technical replicates. 


tissue showed great divergence as expected for columnar and stratified 
epithelia, the gene expression profiles of undifferentiated ISC^{GS} 
and TBSC^{GS} differed by less than 4% (>2.0-fold, P < 0.05) (Fig. 1e). 
ISC^{GS} showed high expression of intestinal stem-cell markers such as 
OLFM4, CD133 (ref. 22), Lgr5 (ref. 23) and Lrig1 (ref. 24), whereas 
those from the airways had the typical stem-cell markers of stratified 
epithelia (Krt14, Krt5 and Tp63 (ref. 11)) (Fig. 1f). 


Intestinal stem cell variation 

Approximately one in 2,000 cells from duodenum (I^{Duo}SC), jejunum 
(I^{Jej}SC) and ileum (I^{Ile}SC) of a 21-week-old fetal intestine forms a colony 
(Fig. 2a). Although these colonies were morphologically indistinguishable 
in culture, whole-genome expression analysis of multiple 
pedigrees showed a consistent, region-specific signature of 24-178 
genes (>1.5-fold, P < 0.05; Fig. 2b; Extended Data Fig. 2a). 

After 10 days at an ALI, I^{Duo}SC and I^{Jej}SC gave rise to a finer pattern 
of epithelial folds than that produced by I^{Ile}SC (Fig. 2c). By histology, 
villi appear progressively more robust along the anterior-posterior 
axis, with I^{Ile}SC producing the larger villi and more numerous goblet 
cells (Fig. 2d, e). Interestingly, the epithelia derived from I^{Duo}SC 
expressed markers more typical of gastric epithelium (for example, 
Figure 2 | Stem cells from fetal small intestine. a, Depiction of small intestine 
and clones derived from each. Scale bar, 400 μm; n = 3 biological replicates. 
b, Heat map of pedigrees from duodenum (Duo), jejunum (Jej), and ileum (Ile). 
c, Surface views of ALI cultures. Scale bar, 200 μm; n = 30 technical replicates. 
d, e, Histological sections through ALI cultures at low (scale bar, 150 μm) and 
high (scale bar, 50 μm) magnification. f, Immunofluorescence on sections of 
ALI cultures with indicated antibodies. ECAD, E-cadherin. Scale bar, 75 μm; 
n = 3 technical replicates. g, PCA map of stem cell gene expression from the 
three major regions of the small intestine together with their corresponding 
ALI-differentiated epithelia. 


TFF2 and Muc5AC), consistent with the duodenum’s location between 
the stomach and the small intestine (Extended Data Fig. 2b). 
I^{Jej}SC-derived epithelium, however, expressed Muc2, consistent with 
intestinal epithelium (Extended Data Fig. 2c), and I^{Ile}SC produced 
an epithelium more akin to colon (Fig. 2f). The pattern of proliferation 
in the ALI epithelia as measured by Ki67 staining was generally 
confined to cells proximal to the support membrane (Fig. 2e, f). PCA 
mapping of gene expression revealed more divergence among 
ALI-differentiated tissue than among the intestinal stem cells (Fig. 2g). 


Colon stem cells 

We also generated single-cell pedigree lines from the ascending, 
transverse, and descending colon from the same 21-week fetal demise 
case (Fig. 3a). The variation in gene expression between the stem cells 
of these colonic segments was minimal with signatures of 19-28 genes 
(>1.5-fold, P < 0.05; Fig. 3b). As with pedigrees derived from the 
intestinal epithelium, those from the colon could be propagated for 
months without loss of clonogenicity (not shown). Differentiation of 
these colon pedigrees under the same ALI conditions used for 
the intestinal stem cells resulted in networks of 3D, large-diameter 
structures (Fig. 3c). Consistent with this, the histology of these ALI cultures 
revealed patterns of broad intestinal glands dominated by goblet cells 
(Fig. 3d). These ALI-generated tissues showed strong staining for 
Figure 3 | Stem cells of fetal colon. a, Depiction of colon and clones derived 
from each. Scale bar, 75 μm; n = 3 biological replicates. AC, TC and DC, 
ascending, transverse and descending colon, respectively. b, Expression heat 
map of pedigrees from the three major divisions of the colon. c, Surface images 
of ALI cultures. Scale bar, 100 μm; n = 20 technical replicates. d, Histological 
sections through ALI cultures of colon stem cells. Scale bar, 75 μm. 

intestinal goblet cell marker Muc2, as well as polarized villin 
and Krt20, typical of differentiated colonic epithelium (Fig. 3e). 
And while the colonic stem cells as a group showed minor differences 
in gene expression (Fig. 3b), they gave rise to epithelia with 
more distinct gene expression profiles (Extended Data Fig. 3). PCA 
mapping of these expression data showed a clustering of the colon 
stem cells relative to the intestinal stem cells, with increasingly 
distant spaces occupied by stem cells of the ileum, jejunum and 
duodenum, respectively (Fig. 3f). This distinction in global gene 
expression patterns is reflected, for instance, in the differential expression 
of transcription factors. In particular, ONECUT2, NR0B2, TRPS1 
and ZNF503 show relatively high expression in the small intestine 
stem cells, whereas those of the colon showed a bias for Hox genes 
as well as the global chromatin organizer genes SATB1 and SATB2 
(Fig. 3g, h). 
Figure 4 | Differential gene expression in stem cells of stratified and 
columnar epithelia. a, PCA map of stem cells of stratified epithelia (cornea, 
corneal epithelium; MG, mammary gland; PG, prostate gland; TBSC, tracheo- 
bronchial epithelial stem cells) and columnar epithelia (FTSC, fallopian tube 
epithelium). b, Gene expression in stem cells (stratified epithelia n = 3 technical 
replicates; columnar epithelia n = 2 technical replicates). c, Transcription 
e, Immunofluorescence on sections through ALI cultures with indicated 
antibodies. Scale bar, 50 μm. f, PCA map of gene expression of colon and 
intestine stem cells. g, Expression heat map of stem cells of small intestine and 
colon. h, PCA map of gene expression profiles of intestinal stem cells and 
their corresponding ALI-differentiated epithelia. 


Columnar versus stratified epithelia 


The expression profiles of stem cells of the human intestinal tract enabled 
a detailed comparison with those of stratified epithelia including 
human epidermis, corneal epithelium, mammary gland, prostate 
gland and upper airway. From this analysis it is clear that stratified 
epithelia, all of which depend on the p53-related stem-cell marker p63 
for long-term self-renewal, occupy a distinct expression space from 
that of the intestinal stem cells or other columnar epithelial stem cells 
(Fig. 4a). A survey of genes whose expression is associated with stem 
cells of one of these two major classes of epithelia revealed a strong 
bias for Olfm4, CD133 (ref. 22), Lgr5 (ref. 23), Nr5a2 (ref. 25), Id2, 
Lrig1 (ref. 24), EphB2, Ascl2 and EphB3 in the intestinal stem cells, 
while the stratified epithelial stem cells expressed ZNF750, TP63 and 
KRT5 (Fig. 4b). Many of the markers differentially appearing in the 
intestinal stem cells, such as Olfm4, Lgr5 and Ascl2, are not general 
factors differentially expressed in TBSC and ISC. d, ALI-differentiated adult 
terminal ileum stem cells derived from endoscopic biopsy. Scale bar, 50 μm; 
n = 10 technical replicates. e, PCA map of stem cells of adult terminal ileum, 
colon, fetal ISCs, and stratified epithelia. f, Stem cell markers in adult terminal 
ileum stem cells and TBSCs. 
Figure 5 | Genomic stability of ISC in culture. a, Clone selection for pedigree 
generation. Scale bar, 200 μm. b, Serial passaging of pedigrees. c, CNV, BAF 
(B allele frequency) and LRR (log R ratio) profiles of pedigrees at P5 to P20 
and trisomy 12 indicated (circle). d, ALI-differentiated pedigree 2 at P7, P17, 
and P27 stained with H&E (top), Alcian blue (middle), and periodic acid Schiff 
(bottom). Scale bar, 100 μm; n = 4 technical replicates. e, Clonogenicity assay 
revealing Rhodamine red-stained colonies grown 20 days following seeding 
1,000 passaged cells. Scale bar, 10 mm; n = 3 technical replicates. 
f, Quantification of clonogenicity at indicated passage number of ground state 
stem cells from jejunum (I^{Jej}SC) and ileum (I^{Ile}SC). n = 3 biological replicates; 
error bars, s.d. 


columnar epithelial stem cell markers as evidenced by their absence in 
fallopian tube stem cells, although Lrig1 is more highly expressed in 
fallopian tube stem cells than in those of either the intestine or the colon 
(Extended Data Fig. 4a). Notably, Bmi1, a member of the Polycomb 
group (PcG) PRC1-like complex implicated in self-renewal both in 
haematopoietic stem cells and in reserve cells for proliferating, Lgr5+ 
intestinal stem cells, was not differentially expressed in the cloned 
intestinal versus stratified epithelial stem cells. And while many of 
the typical markers of intestinal stem cells such as Lgr5, CD44, Lrig1, 
EphB2 and ASCL2 show a decrease in expression as the intestinal 
stem cells are differentiated in ALI cultures, Bmi1 did not 
(Extended Data Fig. 4b, c). These findings suggest that we are cloning 
either crypt cells or so-called ‘+4’ cells that have become crypt-like in 
their expression patterns. We also examined transcription factors 
differentially expressed in ISC compared to stratified epithelial stem 
cells in an effort to understand the regiospecificity of commitment 
programs of stem cells along the intestinal tract (Fig. 4c). In addition 
to six transcription factors that were uniformly highly expressed in 
stem cells of the intestinal tract (CREB3L1, Myb, NR5A2, IRF8, 
HNF4G and Msx2) versus tracheobronchial stem cells, this analysis 
revealed limited sets of transcription factors differentially expressed 
in stem cells along the anterior-posterior axis of the intestinal tract 
that conceivably function in maintaining commitment states. For 
instance, and consistent with previous observations, GATA4 and 
GATA6 were expressed most strongly in the anterior portions of the 
intestinal tract (Fig. 4c). Significantly, the selective deletion of GATA4 
and GATA6 in the murine duodenum and jejunum promotes ileal 
properties and a detrimental phenotype, suggesting a role for these 
transcription factors in maintaining segmental identity acting at the 
level of the stem cell. Similarly, the requirement for Onecut2 in 
the duodenum might be at the level of the duodenal stem cells. It 
is likely that analyses of cloned stem cells from the various segments of 
the intestinal tract will help to unravel the roles of such 
segment-specific transcription factors in the establishment of commitment 
and differentiation programs. Importantly, the overall properties of 
ISCs from fetal sources are conserved in those derived from endoscopic 
biopsies of paediatric and adult cases (Fig. 4d-f). 


Genomic and lineage stability 


Human embryonic stem cells and iPSC lines acquire genomic structural 
variations with successive passages, including some that confer a 
selective advantage. To assess the genomic stability of our ISC^{GS}, 
we examined copy number variation (CNV) and single nucleotide variation 
(SNV) in two independent ISC^{GS} pedigrees derived from the ileum 
of one fetal demise case after 50 (passage 5; P5), 100 (P10), 150 (P15), 
and 200 days (P20) of continuous proliferation (Fig. 5a, b). At P5, 
when single ISC^{GS} pedigrees can be amplified to an estimated 300 
million to 75 billion cells, no chromosomal aneuploidies were 
detected, although one pedigree showed three interstitial deletions 
affecting two genes (Fig. 5c; Extended Data Fig. 5a; Supplementary 
Information Table 1). This low level of structural variation was maintained 
through passage 10 but had increased by P15, and at P20 one of 
the pedigrees showed a frank trisomy of chromosome 12 (Fig. 5c; 
Extended Data Fig. 5a; Supplementary Information Table 1). A similar 
upward trend in CNV as a function of passage number was 
observed in five intestinal pedigrees (pedigrees 3-7) derived from a 
separate fetal demise case (Extended Data Figs 5, 6; Supplementary 
Information Tables 1, 2). 
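
The expansion figures quoted for P5 can be turned into an implied number of population doublings and a doubling time. The arithmetic below is only a back-of-envelope sketch: it assumes each pedigree starts from a single founder cell, that no cells are lost or discarded, and that P5 corresponds to 50 days as stated above. The function names are hypothetical.

```python
import math


def doublings(cell_count: float, founders: float = 1.0) -> float:
    """Population doublings needed to reach cell_count from the founders."""
    return math.log2(cell_count / founders)


def doubling_time_days(cell_count: float, days: float, founders: float = 1.0) -> float:
    """Average doubling time implied by reaching cell_count after `days`."""
    return days / doublings(cell_count, founders)


low_estimate = 3e8     # 300 million cells at P5
high_estimate = 7.5e10  # 75 billion cells at P5

low_doublings = doublings(low_estimate)    # about 28 doublings
high_doublings = doublings(high_estimate)  # about 36 doublings
slow_rate = doubling_time_days(low_estimate, 50)   # about 1.8 days
fast_rate = doubling_time_days(high_estimate, 50)  # about 1.4 days
```

Under these assumptions the quoted range corresponds to roughly 28 to 36 doublings in 50 days, or one doubling every 1.4 to 1.8 days.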

By exome sequencing, our original two pedigrees showed few (0-1) 
non-synonymous mutations through passage 10, and these increased 
modestly (1-2 new non-synonymous mutations) through P15 and 
P20 (Extended Data Fig. 5a). None of the genes carrying these 
non-synonymous mutations has been reported as a driver gene in human 
cancers. A similar trend was observed in the five pedigrees from the 
second fetal demise case followed through P5 and P25. By P25 the range 
of non-synonymous SNVs increased to 2-10 per clone, and while not 
involving obvious cancer driver genes, did include genes such as ECT2L 
and EP300 that might provide a selective growth advantage (Extended Data 
Fig. 5c). These data indicate that most pedigrees sustain few genomic 
changes within the first 100 days of proliferative expansion. By P15 
and through P25, however, half the pedigrees showed evidence for 
aneuploidy as well as an increase in interstitial CNV and SNVs with 
allele frequencies nearing 0.5, suggesting the rise of an advantaged 
subclone. We asked how these late-passage genomic changes might 
affect differentiation by comparing early and late passages of pedigree 
2 in ALI differentiation. By all histological criteria, including Alcian 
blue staining for goblet cells and intestinal marker staining, we could 
not distinguish the ALI-differentiated epithelia derived from P7, P17 
and P27 (Fig. 5d; Extended Data Fig. 7). Similarly, we note that these 
intestinal stem cell pedigrees do not lose (or gain) clonogenicity when 
tested at P7 and P16: it remains stably above 50% (Fig. 5e, f). 
Lastly, we found no evidence of tumorigenicity by these ground state 
intestinal stem cells, including those at P25 harbouring aneuploidies, 
following their subcutaneous implantation into immunodeficient 
(NOD.Cg-Prkdc^{scid} Il2rg^{tm1Wjl}/SzJ) mice (Extended Data Fig. 8). 


Modelling Clostridium difficile infections 


C. difficile is a Gram-positive, spore-forming bacterium and the 
primary cause of nosocomial diarrhoea and pseudomembranous 
colitis. The pathogenicity of C. difficile is linked to its production of two 
similar, high molecular weight toxins, TcdA and TcdB. While together 
TcdA and TcdB cause fluid secretion, inflammation, and colonic 
Figure 6 | C. difficile toxin B effects on in vitro-generated colonic epithelia. 
a, TcdB effects on colonic stem cell-derived epithelia. Scale bar, 100 μm. n = 4 
technical replicates. b, Tight junction protein claudin 3 (CLDN3; red) and 
adherens junction marker cadherin-17 (CDH17; green) in ALI colonic 
epithelium treated with TcdB. Scale bar, 50 μm; n = 4 biological replicates. 
c, Dextran permeability assay on TcdB-treated ALI colonic epithelia. d, 3D plot 

tissue damage, their respective and possible synergistic roles have 
been difficult to ascertain. We therefore challenged colonic epithelia 
derived from cloned, ground state colonic stem cells with 
recombinant TcdB (Fig. 6a, b; Extended Data Fig. 9a, b). At higher 
concentrations or longer time points there is a loss of goblet cells, 
disruption of the crypt architecture and cell polarity, and a specific loss of 
tight versus adherens junction proteins that correlates with increased 
dextran permeability (Fig. 6c). These dose-response changes in the 
ALI colonic epithelium mirror those of C. difficile-associated 
pseudomembranous colitis (Fig. 6d, Extended Data Fig. 9a, b). Microarray 
analysis of ALI-generated colonic epithelia following nine TcdB 
treatment conditions revealed alterations in gene expression in a time- and 
dose-dependent manner (Fig. 6e, f; Extended Data Fig. 9c-f). Pathway 
analysis indicated that TcdB triggers changes in gene expression 
related to inflammation, RhoB-mediated actin regulation, and 
junctional dynamics previously implicated in C. difficile pathology. In 
addition, this analysis revealed that DUOX2 and DUOXA2 were 
consistently the two most highly upregulated genes (Fig. 6e, f). These proteins 
form an enzyme capable of producing hydrogen peroxide and have 
been implicated in the inflammation of inflammatory bowel disease 
(IBD). Finally, we also tested C. difficile TcdA in our model. TcdA is 
reported to be a specific enterotoxin, and indeed we found that it 
triggers similar cytopathic and permeability changes in ALI models of 
human colonic epithelium (Extended Data Fig. 10), albeit at lower 
doses than those effective for TcdB. Together these findings 
underscore the potential of this model system to recapitulate and elucidate 
C. difficile pathology. 


Discussion 


Adult stem cells of the highly regenerative intestinal tract remain 
largely defined by metabolic, marker profiling, or lineage tracing 
0-3 rating for colonic epithelial integrity. e, Heat map of 39 genes differentially 
expressed between TcdB (500 pM, 24 h) and controls (> threefold and P < 0.05 
by Student’s t-test). f, 3D plot of seven selected genes at time points and doses 
indicated. n = 2 technical replicates. 


experiments in vivo or transplantation of cells from intestinal 
organoids. As stem cells comprise only a minor component 
of organoids (perhaps less than 1%), the molecular features of 
stem cells of columnar epithelia such as the intestinal tract have 
remained unclear. Therefore the selective cloning and proliferative 
expansion of highly clonogenic, ground state intestinal stem cells 
described here offers a first glimpse into the molecular properties of 
these cells. Our inability to convert differentiated cells to clonogenic 
cells supports the notion that we are cloning resident stem cells 
rather than somehow ‘reprogramming’ differentiated enterocytes. 
These resident stem cells possess robust epigenetic programs of 
commitment to regiospecific intestinal differentiation that are stable 
despite more than six months of continuous propagation. This 
cell-autonomous regiospecificity of stem cells along the intestinal tract 
argues against a unitary ‘intestinal stem cell’ or even one each for the 
histologically recognized segments, but rather for a developmentally 
established spectrum of stem cells that ultimately maintains the 
histological and functional properties that define these segments. A 
heuristic deciphering of the commitment code from the regiospecific 
expression patterns presented here will guide parallel efforts with 
iPSCs to achieve appropriate lineage fates. Interestingly, many 
inductive signalling pathways and transcription factors implicated 
in embryonic gut formation may act to reinforce commitment codes 
via continued expression in stem cells of the intestinal tract. 

We anticipate that the ability to maintain these stem cells in their 
elemental state will enable the discovery of epigenetic mechanisms 
that underlie properties of very long-term self-renewal, exquisitely 
precise lineage commitment, and the intrinsically directed, self- 
assembly of differentiated epithelia. Although we demonstrate the 
potential of clonally-derived colonic epithelia to model the pathogenesis
of C. difficile toxins, we anticipate the need to restore complexity
in the form of mesenchyme, immune cells, enteric neurons and perhaps
components of the microbiome to fully recapitulate disease
dynamics. In particular, enteric maladies such as inflammatory bowel 
disease represent important medical challenges whose aetiologies 
most likely reside in interactions between the immune system, intest- 
inal mucosa and intestinal flora*°°. Finally, the ability to clone 
patient-specific, ground state stem cells from columnar epithelia via 
endoscopic biopsies, coupled with their orders-of-magnitude expan- 
sion kinetics over organoids, favours their use in regenerative medi- 
cine, pre-clinical trials and disease modelling. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS


In vitro culture of human small intestinal and colonic epithelial stem cells. 
Intestinal tissue from 20- to 21-week-old late fetal demise cases was obtained
with parental consent as de-identified material under approved institutional
review board protocols at the Brigham and Women’s Hospital, Boston, MA, 
USA (2009P002281). Terminal ileum endoscopic biopsies were obtained under 
informed consent and institutional review board approval at the Connecticut 
Children’s Medical Center, Hartford, Connecticut USA (15-047J-2). Fetal intest- 
inal tissue or 1 mm endoscopic biopsies from terminal ileum were collected into 
cold F12 media (Gibco, USA) with 5% fetal bovine serum (HyClone, USA) and 
then minced by sterile scalpel into 0.2-0.5 mm³ pieces to a viscous and homogeneous
appearance. The minced tissue was digested in 2 mg ml⁻¹ collagenase type IV
(Gibco, USA) at 37 °C for 30-60 min with agitation. Dissociated cells were passed
through a 70-µm Nylon mesh (Falcon, USA) to remove aggregates and then were
washed four times in cold F12 media, and then seeded onto a feeder layer of
lethally irradiated 3T3-J2 cells in c-FAD media modified to SCM-6F8 media
by the addition of 125 ng ml⁻¹ R-spondin1 (R&D systems, USA), 1 µM Jagged-1
(AnaSpec Inc., USA), 100 ng ml⁻¹ human Noggin (Peprotech, USA), 2.5 µM
Rock-inhibitor (Calbiochem, USA), 2 µM SB431542 (Cayman chemical, USA),
and 10 mM nicotinamide (Sigma-Aldrich, USA). Cells were cultured at 37 °C in a
7.5% CO2 incubator. The culture media was replaced every two days. Colonies
were digested by 0.25% trypsin-EDTA solution (Gibco, USA) for 5-8 min and 
passaged every 7 to 10 days. To obtain single-cell suspensions, colonies were
trypsinized with TrypLE Express solution (Gibco, USA) for 8-15 min at 37 °C
and cell suspensions were passed through 30-µm filters (Miltenyi Biotec,
Germany). Approximately 20,000 epithelial cells were seeded into each well of
a 6-well plate. Cloning cylinders (Pyrex, USA) and high vacuum grease (Dow
Corning, USA) were used to select single colonies for pedigrees. Gene expression
analyses were performed on cells derived from passage 4-8 (P4-P8) cultures. 
Histology and immunostaining. Histology, haematoxylin and eosin (H&E), 
Alcian blue, periodic acid-Schiff (PAS), rhodamine B staining, immunohisto- 
chemistry, and immunofluorescence were performed using standard techniques. 
For immunofluorescence and immunohistochemistry, 4% paraformaldehyde- 
fixed, paraffin-embedded tissue sections were subjected to antigen retrieval in 
citrate buffer (pH 6.0, Sigma-Aldrich, USA) at 120 °C for 20 min, and a blocking 
procedure was performed with 5% bovine serum albumin (BSA, Sigma-Aldrich, 
USA) and 0.05% Triton X-100 (Sigma-Aldrich, USA) in phosphate-buffered 
saline (PBS; Gibco, USA) at room temperature for 1 h. Primary antibodies used 
in this study and staining conditions are listed in Supplementary Information
Table 3. All images were captured by using the Inverted Eclipse Ti-Series (Nikon, 
Japan) microscope with Lumencor SOLA light engine and Andor Technology 
Clara Interline CCD camera and NIS-Elements Advanced Research v.4.13 soft- 
ware (Nikon, Japan) or LSM 780 confocal microscope (Carl Zeiss, Germany) with 
LSM software. Bright field cell culture images were obtained on an Eclipse TS100 
microscope (Nikon, Japan) with Digital Sight DS-Fi1 camera (Nikon, Japan) and
NIS-Elements F3.0 software (Nikon, Japan). 

Stem cell differentiation. Air—liquid interface (ALI) culture of TBSCs was per- 
formed as described’**". Briefly, for ALI culture of intestinal and colonic epithelial 
stem cells, Transwell inserts (Corning, USA) were coated with 20% Matrigel (BD 
Biosciences, USA) and incubated at 37°C for 30 min to polymerize. 200,000 
irradiated 3T3-J2 cells were seeded into each Transwell insert and incubated in a
37 °C, 7.5% CO2 incubator overnight. QuadroMACS Starting Kit (LS)
(Miltenyi Biotec, Germany) was used to purify the stem cells by removal of feeder 
cells. 200,000-300,000 stem cells were seeded into each Transwell insert and 
cultured with SCM-6F8. At confluency (3-7 days), the apical media was removed 
through careful pipetting and the cultures were continued for an additional 6-12 
days before analysis. 

Clostridium difficile toxin treatment and epithelial permeability assay. 
Clostridium difficile toxins A and B (TcdA, TcdB) were prepared as described*. 
Intestinal stem cells were differentiated in air-liquid interface cultures as 
described above and treated with 100 pM, 250 pM, 500 pM and 10 nM TcdA or TcdB
for 0, 8, 16 and 24 h. At these time points, membranes with differentiated
epithelia were collected for histology and microarray analysis. 4 kDa FITC-dextran
(Sigma-Aldrich, USA) was added to the apical chamber of the Transwell
inserts to a final concentration of 0.5 mg ml⁻¹. Media was removed from the
bottom compartment after different incubation times and fluorescence was read 
by fluorometer (Infinite M1000 PRO, excitation 490 nm, emission 520 nm, 
Tecan, USA). 
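
The permeability values reported as Papp can be illustrated with the conventional definition of the apparent permeability coefficient, Papp = (dQ/dt) / (A × C0). The following is a minimal sketch under stated assumptions, not the authors' analysis: the function names, the membrane area and the sample amounts are hypothetical, and only the 0.5 mg ml⁻¹ apical concentration comes from the text above.

```python
def flux_from_samples(times_s, amounts_mg):
    """Least-squares slope dQ/dt (mg/s) of cumulative basolateral tracer vs time."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_q = sum(amounts_mg) / n
    num = sum((t - mean_t) * (q - mean_q) for t, q in zip(times_s, amounts_mg))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den


def apparent_permeability(flux_mg_per_s, area_cm2, c0_mg_per_ml):
    """Papp in cm/s: flux divided by (membrane area x initial apical concentration).

    1 mg/ml equals 1 mg/cm^3, so the units reduce to cm/s.
    """
    if area_cm2 <= 0 or c0_mg_per_ml <= 0:
        raise ValueError("area and initial concentration must be positive")
    return flux_mg_per_s / (area_cm2 * c0_mg_per_ml)


# Hypothetical readings: FITC-dextran recovered from the bottom compartment.
times_s = [0, 3600, 7200]
amounts_mg = [0.0, 0.0018, 0.0036]

flux = flux_from_samples(times_s, amounts_mg)
papp = apparent_permeability(flux, area_cm2=0.33, c0_mg_per_ml=0.5)  # area is assumed
print(f"Papp = {papp:.2e} cm/s")
```

In practice the fluorescence readings would first be converted to tracer amounts with a standard curve, a step omitted here.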

Implantation of intestinal stem cells. Intestinal stem cells (1.5 million cells) 
from different pedigrees in 50% Matrigel (BD Bioscience, USA) were subcutaneously
implanted into female, six- to eight-week-old immunodeficient
(NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ) mice under IACUC approval (100533-1115).
To test spontaneous transformation of the stem cells, mice were monitored
every month (up to 4 months).

RNA and genomic DNA sample preparation. For stem cell colonies, RNA was 
isolated using PicoPure RNA Isolation Kit (Life Technologies, USA). For ALI- 
differentiated epithelia, RNA was isolated using TRIzol RNA Isolation Kit (Life 
Technologies, USA). RNA quality (RNA integrity number, RIN) was measured 
using an Agilent 2100 Bioanalyzer and Agilent RNA 6000 Nano Kit (Agilent
Technologies, USA). RNAs having a RIN > 8 were used for microarray analysis. 
Genomic DNA was extracted with DNeasy Blood & Tissue kit (Qiagen, 
Netherlands) from intestinal and colonic stem cells for CNV analysis and exome 
capture sequencing. For genomic DNA extraction, human intestinal and colonic 
stem cells were isolated from mouse 3T3 feeder layer using QuadroMACS 
Starting Kit (Miltenyi Biotec, Germany). Genomic DNA concentration was mea- 
sured with Qubit dsDNA BR Assay Kit (Life Technologies, USA). 

Expression microarray and bioinformatics. Total RNAs obtained from imma- 
ture colonies and ALI-differentiated structure were used for microarray prepara- 
tion with WT Pico RNA Amplification System V2 for amplification of DNA and 
Encore Biotin Module for fragmentation and biotin labelling (NuGEN 
Technologies, USA). RNA quality (RNA integrity number, RIN) was measured 
by analysis using an Agilent 2100 Bioanalyzer and Agilent RNA 6000 Nano Kit 
(Agilent Technologies, USA). RNAs having a RIN > 8 were used for microarray 
analysis. All samples were prepared according to manufacturer’s instructions and 
hybridized onto GeneChip Human Exon 1.0 ST Array (Affymetrix, USA). 
GeneChip operating software was used to process all the Cel files and calculate 
probe intensity values. To validate sample quality, quality checks were conducted 
using Affymetrix Expression Console software. The intensity values were log2-transformed
and imported into the Partek Genomics Suite 6.6 (Partek
Incorporated, USA). Exons were summarized to genes and a 1-way ANOVA 
was performed to identify differentially expressed genes. For two-sample statist- 
ics, P values were calculated by Student’s t-test for each analysis. Unsupervised 
clustering and heat map generation were performed with sorted data sets by 
Euclidean distance based on average linkage clustering, and principal component 
analysis (PCA) map was conducted using all or selected probe sets by Partek 
Genomics Suite 6.6. Gene set enrichment analysis (GSEA)*’ was performed for C. 
difficile toxin B treatment. For the region-specific gene signature of small intestine 
and colon comparison (PD, PJ and MI for Fig. 2b and AC, TC and DC for Fig. 3b), 
differentially-expressed genes were selected with a cut-off value of 1.5-fold and
P < 0.05 in each comparison (for example, (1) PD vs. PJ and (2) PD vs. MI), and
the genes common to the two gene lists were taken as the regiospecific gene set
for that region. In the heat maps (Fig. 2b and 3b), 3 regiospecific gene sets (PD,
PJ and MI, or AC, TC and DC) were combined, and the heat maps were made 
with Euclidean distance based on average linkage clustering. For C. difficile toxin 
B treatment data sets, samples from indicated time points and dosages were 
compared with control (untreated samples). Differentially-expressed genes (two- 
fold upregulated and downregulated genes) were counted and plotted in 3D 
column plots (Extended Data Fig. 8c). In comparison of 500 pM 24 h toxin B 
treatment with control, 39 genes were significantly upregulated (cut-off value: 
3-fold and P < 0.05) and a heat map (Fig. 6e) was made with 39 genes using all 
samples. The whole-genome expression data for the 500 pM, 24 h toxin B treatment vs.
control were analysed with the GSEA program to detect significantly enriched pathways
in toxin B treatment. Selected pathways (from KEGG) are shown in Fig. 6d.
Data sets generated for this study have been submitted to the National Center for 
Biotechnology Information Gene Expression Omnibus (GEO) database under 
accession number GSE66749. 
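
The selection of region-specific gene sets described above (threshold each pairwise comparison, then intersect the lists) can be sketched as follows. This is an illustrative sketch only: the actual analysis was run in Partek Genomics Suite, the function names and toy values are hypothetical, and the sketch assumes that fold change is expressed as the region of interest over the comparison region.

```python
def significant_genes(fold_change, p_value, min_fold=1.5, max_p=0.05):
    """Genes passing both cut-offs in a single pairwise comparison."""
    return {
        gene
        for gene, fold in fold_change.items()
        if fold > min_fold and p_value[gene] < max_p
    }


def regiospecific_genes(comparisons, min_fold=1.5, max_p=0.05):
    """Intersect the significant gene lists of all pairwise comparisons."""
    gene_lists = [
        significant_genes(fold, p, min_fold, max_p) for fold, p in comparisons
    ]
    return set.intersection(*gene_lists)


# Toy values (hypothetical, not data from the study):
# each tuple is (fold change per gene, P value per gene).
pd_vs_pj = ({"GENE_A": 2.4, "GENE_B": 1.8, "GENE_C": 1.1},
            {"GENE_A": 0.01, "GENE_B": 0.03, "GENE_C": 0.20})
pd_vs_mi = ({"GENE_A": 3.0, "GENE_B": 1.2, "GENE_C": 2.5},
            {"GENE_A": 0.02, "GENE_B": 0.04, "GENE_C": 0.01})

pd_signature = regiospecific_genes([pd_vs_pj, pd_vs_mi])
print(sorted(pd_signature))
```

Only a gene that passes both cut-offs in both comparisons survives the intersection, which is what makes the resulting set specific to one region rather than to one pairwise contrast.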

No statistical methods were used to predetermine sample size. 

Copy number variation. For copy number variation analysis of stem cell pedi- 
grees and passage 0 pooled sample, genomic DNA samples were genotyped with 
HumanOmniExpress BeadChip Kit for clone 1 and 2 (passage 5, 10, 15 and 20) 
(Illumina, USA) and Illumina HumanOmniZhonghua BeadChip Kit for clones 3 
to 7 (passage 5 and 25) following the manufacturer’s instructions. Analysis of 
BeadChip was performed using GenomeStudio Software (Illumina, USA). 
Illumina high-density SNP genotyping data were used for kilobase-resolution
detection of copy number variation. CNVs detected in passage 0 pooled samples
were considered germline CNVs and removed from the analysis. CNV calls were
generated by PennCNV. Genes within 10 kb of CNV regions are reported. The
parameters were set as “-expandleft 10k” and “-expandright 10k”. Other parameters
were default. A confidence score >10 was used as a cutoff. The call rates for CNV
were all greater than 99%, and two larger CNV amplification and deletion events 
were validated by quantitative PCR. 
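
The two filtering steps above (confidence cutoff and removal of germline CNVs seen in the passage 0 pool) can be sketched as below. This is a hedged illustration, not the PennCNV workflow itself: the `CNV` record, the function names and the rule that any coordinate overlap with a passage 0 call counts as germline are all assumptions, since the text does not state the matching criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    chrom: str
    start: int
    end: int
    confidence: float


def overlaps(a, b):
    """True when two calls share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def somatic_cnvs(pedigree_calls, passage0_calls, min_confidence=10.0):
    """Keep calls above the confidence cutoff that overlap no passage 0 call."""
    return [
        call
        for call in pedigree_calls
        if call.confidence > min_confidence
        and not any(overlaps(call, germline) for germline in passage0_calls)
    ]


# Hypothetical calls for illustration only.
passage0 = [CNV("chr1", 1_000_000, 1_050_000, 35.0)]
pedigree = [
    CNV("chr1", 1_020_000, 1_060_000, 40.0),  # overlaps a passage 0 call
    CNV("chr7", 5_000_000, 5_200_000, 8.0),   # below the confidence cutoff
    CNV("chr12", 9_000_000, 9_400_000, 22.0), # retained
]

retained = somatic_cnvs(pedigree, passage0)
print(retained)
```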

Exome capture sequencing. For exome capture and high-throughput sequencing 
for intestinal stem cells (pedigree 1 and 2), 50 ng of genomic DNA was used as
input for the Nextera Expanded Exome Kit (Illumina, USA). For pedigree 3 to 7, 1 µg
of genomic DNA was sheared using a Covaris S1 Ultrasonicator (Covaris, USA),
end-repaired, A-tailed, and adaptor-ligated. Exome capture was performed using
a Tru-seq Exome Enrichment Kit (Illumina, USA) following the manufacturer's 
instructions. Multiplexed libraries were sequenced on an Illumina HiSeq sequen- 
cer using 101-bp paired-end reads. Reads were aligned to the reference genome 
(UCSC hg19) using Burrows Wheeler Aligner (BWA, 0.6.2)°°. PCR duplicates 
were removed using PICARD-1.94 (http://picard.sourceforge.net). The Genome 
Analysis Toolkit (GATK framework version 2.6.4)°° was used to realign reads 
near indels and to recalibrate base quality values. 

When running GATK, the minimum phred-scaled confidence threshold at
which variants were called (-stand_call_conf) was 50, and the minimum
phred-scaled confidence threshold at which variants were emitted
(-stand_emit_conf) was 30. The criteria for GATK Variant Filtration were as follows:
--clusterWindowSize 10
--filterExpression “MQ0 > 4 && ((MQ0/(1.0*DP)) > 0.1)” --filterName “HARD_TO_VALIDATE”
--filterExpression “DP < 5” --filterName “LowCoverage”
--filterExpression “QUAL < 30” --filterName “VeryLowQual”
--filterExpression “QUAL > 30 && QUAL < 50” --filterName “LowQual”
--filterExpression “QD < 1.5” --filterName “LowQD”
--filterExpression “FS > 150” --filterName “StrandBias”.
Potential mouse genomic DNA contaminant reads were detected
by alignment to the mouse genome (UCSC mm10) and those containing fewer
than 3 mismatches were removed from further analysis. SNVs were called in each
sample separately using SAMtools v0.1.19°’ and GATK in the exome capture 
targeted regions. Variants with at least Q50 confidence, phred-scaled quality 
score more than 40 and coverage higher than 10 were considered as true SNVs. 
Variants were annotated with ANNOVAR (version 11 Feb, 2013)**. Identical 
variant calls in intestinal stem cells (passage 5 and higher) when compared to 
passage 0 pooled samples were used to identify germline SNVs. Sanger sequencing
validation was performed using primers designed with Primer3 software
version 4.0 (http://frodo.wi.mit.edu/). Extracted genomic DNA was amplified
with titanium Taq polymerase (Clontech Laboratories, CA, USA) and purified
PCR products were sequenced in the forward direction using ABI PRISM
BigDye Terminator Cycle Sequencing Ready Reaction kits and an ABI PRISM
3730 Genetic Analyzer (Applied Biosystems, CA, USA). We validated by PCR
and Sanger sequencing 13 of 14 non-synonymous mutations called by our
sequencing efforts, suggesting a false discovery rate of less than 10%. Other quality
control parameters are shown in Supplementary Information Table 4. 
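
To make the hard-filter expressions listed above concrete, the sketch below evaluates the same thresholds on a single variant record in plain Python. It is an illustration of the logic only and does not call GATK: the dictionary-based record and the function name are hypothetical, the cluster-window criterion is not modelled, and a guard against zero depth is added that is not part of the original expression.

```python
# Each entry pairs a filter name with a predicate; a variant is tagged with
# every filter whose predicate is true, mirroring how the expressions are used.
FILTERS = [
    ("HARD_TO_VALIDATE",
     lambda v: v["DP"] > 0 and v["MQ0"] > 4 and v["MQ0"] / (1.0 * v["DP"]) > 0.1),
    ("LowCoverage", lambda v: v["DP"] < 5),
    ("VeryLowQual", lambda v: v["QUAL"] < 30),
    ("LowQual", lambda v: 30 < v["QUAL"] < 50),
    ("LowQD", lambda v: v["QD"] < 1.5),
    ("StrandBias", lambda v: v["FS"] > 150),
]


def failed_filters(variant):
    """Return the names of all filters that flag this variant (empty list = pass)."""
    return [name for name, is_flagged in FILTERS if is_flagged(variant)]


# Hypothetical annotation values for two variants.
clean = {"MQ0": 0, "DP": 20, "QUAL": 100.0, "QD": 10.0, "FS": 2.0}
poor = {"MQ0": 6, "DP": 4, "QUAL": 40.0, "QD": 1.0, "FS": 200.0}

print(failed_filters(clean))
print(failed_filters(poor))
```

As a separate piece of arithmetic, 13 validated calls out of 14 leaves 1 of 14 unvalidated, or about 7%, which is the basis for the stated false discovery rate of less than 10%.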
Extended Data Figure 1 | Loss of clonogenicity in differentiated ISC.
a, Schematic of ISC differentiation using either the γ-secretase inhibitor
dibenzazepine (DBZ) or withdrawal of the Wnt regulator R-spondin 1 (Rspo1).
ISCs were plated on day 0, DBZ added or Rspo1 removed at day 2, and colonies
passaged en masse at day 7. At day 14, after 7 days of continuous growth,
colonies were counted. b, Micrographs show immunofluorescence of day 7
colonies grown without Rspo1 or in the presence of DBZ for 5 days using
antibodies to Ki67, chromogranin A (CHGA), keratin 20 (Krt20), E-cadherin
(E-cad), and mucin 2 (Muc2). Scale bar, 50 µm; n = 4 technical replicates.
c, Histogram shows colony formation in each condition normalized to control
ISCs. n = 4 biological replicates; error bars, s.d. d, Staining of ALI-differentiated
intestinal stem cells with monoclonal antibody HD6 directed to Paneth cells.
Scale bar, 50 µm; n = 4 technical replicates.
a, Gene symbols

Du-high (133 genes):
ABCBE ABHD2 ABHD2 ACOX1 ADAM28 AHNAK AKR1B10 ALDH3A1 ANXA1 ANXA10 ANXAS5 ARHGAP24
ARL4C BACE2 BCAS1 BICC1 BTD C11orf9 C4orf34 CA2 CAPG CAPN6 CD24 CD59
CLDN18 CNOT7 CRIP1 CTSO CXCL17 CYP2C18 CYP3A5 CYSTM1 DNM2 DPCR1 EHD2 EPB41L1
FAM110B FAM177B FAM189A2 FN1 FOSB FSIP2 FUTS FXYD3 GALNT3 GALNT7 GNAQ GPR87
GPRC5B HIPK2 HMGCS2 HOXBS HS3ST5 HSPB1 HTR1B IL18R1 IL2RA KCNE3 KLF4 LEPREL1
LGMN LPAR1 LRP1 LYPD6B MAOB MEIS2 METTL7A MFSD1 MITF MLPH MSMB MUC1
MXD1 MYEOV NDUFB1P1 NEK6 NFATS NKX6-3 NTRK2 OASL P4HA1 PART1 PCDH7 PGC
PLA2G10 PLXNA2 PP7080 PPARGC1A PSCA PVRIG PXDC1 QKI RAB27B RBMS1 RERG RETSAT
RGNEF RHBDL2 RHOBTB1 RNF128 RNF183 ROBO1 S100P SCIN SFTA2 SGK2 SLC16A3 SLC19A3
SLC26A9 SLC41A2 SLC44A4 SLC45A3 SLC4A4 SLC9A1 SLC9A2 SLC9A3 SMPDL3A SPINK1 STEAP1 SULT1C2
SYT9 TFF1 TFF2 TFF3 TM4SF1 TPBG TRAK1 TSHZ2 TSPAN1 TSPAN31 UGDH VSIG1
VSIG2

Je-high (24 genes):
BDKRB2 C2CD4A CCRI CHRM3 CPVL CYP2A13 EPHB1 FAM47B FMOD GLDC HSD17B7P2 IFITM3
LGRS ODAM OR8J1 PHYHIPL RTP4 SESN1 SHPK SLC26A2 STARD13 TPH1 UNC93A ZRSR2

Il-high (178 genes):
ABCB1 ABCC2 ABCG2 ACE2 ACOX2 ADH4 ADH6 AKAP7 ALDOB ANPEP ANXA2P2 APOB
AREG BTNL3 C17orf72 C1orf201 C1orf21 C3orf26 C3orf52 C4BPB CACNA1E CCL25 CCND2 CD52
CDH17 CDX2 CEACAM1 CEACAMS CELA3A CELA3B CHGA CIDEC CLCA1 CLIC5 CLRN3 CPE
CSF1R CYBRD1 CYP2A7 DACH1 DENND1A DHRS11 DKK1 DMBT1 DOK3 DOPP4 DSG3 DUSP5
EFCAB4B EGR2 EML1 EREG F2R FABP2 FABP6 FAM105A FCGBP FGF23 FITM2 FOLH1
FRZB GALNT8 GBA3 GCET2 GDAP1 GFOD1 GHRL GIP GJA1 GLS GPA33 GSDMB
GSTA2 GUCY2C HEPACAM2 HHLA2 HLA-DRB1 HNF4G HTR1D IL18 IL2RG IL32 INE1 IRF8
ITLN1 JAG1 KIAA0226L KIRREL KLF7 KRT20 KRT33B KRT80 L1TD1 LCT LEAP2 LGALS2
LGALS3 LINC00483 LOC100132099 MANBAL MAOA MARCH3 MARCHS MB21D2 MEP1A MEP1B MICAL2 MIR17HG
MLN MOGAT3 MRPS18A MUC13 MUC17 MYO1A MYO1E MYO7B NABP1 NELL2 NIPAL1 NMES
NODAL NOX1 NPY6R NR1H4 O3FAR1 OSR2 OSTBETA OSTalpha OTC PADI2 PAPSS2 PDE10A
PDE3A PDP1 PI3 PLA2G12B PLK1S1 PMP22 PRAP1 RASGRF2 RBP2 RGS2 RHOB RNF182
RNF217 SARM1 SATB2 SEMA3D SEMA6D SERTAD1 SI SIDT1 SLC17A4 SLC2A5 SLC30A2 SLC46A3
SLC6A20 SLC7A6 SNX10 TCEANC TGFBI THEM4 TM4SF20 TM6SF2 TMEM45B TRIM36 TUBAL3 TUFT1
UGT2B15 VIN XDH YAE1D1 ZG16 ZNF208 ZNF347 ZNF502 ZNF705G ZYX

Extended Data Figure 2 | Intestinal stem cell expression profiles. a, List of
genes differentially expressed in ISC derived from duodenum, jejunum and
ileum. These data correspond to heat map of Fig. 2b. b, Immunofluorescence
labelling of ALI-differentiated ISCs from duodenum with antibodies against
Tff2, mucin 5AC, villin, E-cadherin, and mucin 2. c, Immunofluorescence
labelling of ALI-differentiated epithelia from jejunum stem cells with
antibodies to E-cadherin, mucin 2, villin, and mucin 5AC. Scale bar, 50 µm;
n = 10 technical replicates.
Extended Data Figure 3 | Differential gene expression in epithelia derived from colonic stem cells. Heat map of differentially expressed (>1.5-fold,
P < 0.05) genes in ALI cultures derived from stem cell pedigrees of ascending, transverse, and descending colon.
Extended Data Figure 4 | Differential gene expression across columnar and
stratified epithelial stem cells. a, Histograms of expression microarray signal
intensity of selected genes across averaged intestine and colon ISCs, stratified
epithelial stem cells, and stem cells of the fallopian tube (FT). Biological
replicates n = 2-6 (FT = 2, stratified epithelia = 3, colon, intestine = 6); error
bars, s.d. b, Dot plot showing expression microarray data of indicated genes
for stem cell pedigrees (ISC; Duo, duodenum; Jej, jejunum; Ile, ileum; AC,
ascending colon; TC, transverse colon; DC, descending colon) derived from
various regions of the intestinal tract before and after air-liquid interface (ALI)
differentiation. Biological replicates n = 2 (total 12 data sets) for stem cells,
technical replicates n = 2 for ALI. c, Chart of aggregate P values by Student’s t-test
for gene expression changes between ground state stem cells and their 
ALI-differentiated counterparts. 
Extended Data Figure 5 | Genes affected by CNV and SNV events in
intestinal stem cell pedigrees during passaging. a, Summary of CNV (events
(genes affected)) and non-synonymous SNV in pedigrees 1 and 2 at P5 to P20.
b, Summary of genes altered by interstitial CNV amplifications (top) or
deletions (bottom) in ISC pedigrees 3 to 7 at P5 and P25. c, Summary of genes
sustaining non-synonymous SNV in five ISC pedigrees at P5 and P25.
Extended Data Figure 6 | Whole-genome CNV profiles for intestinal stem
cell pedigrees 3-7 at P5 and P25. Regions marked by ovals represent
aneuploidy.
Extended Data Figure 7 | Impact of ISC passaging on ALI differentiation.
ALI differentiation of intestinal pedigree 2 initiated from cells at the indicated
passage number. As indicated, histological sections of differentiated epithelia
were stained with antibodies to either E-cadherin (ECAD, green) and mucin
2 (Muc2, red), or Ki67 (green) and chromogranin A (CHGA, red). Scale bar,
75 µm; n = 4 technical replicates.
Extended Data Figure 8 | ISC tumorigenicity assays in immunodeficient
mice. a, Quantification of tumour formation assessments at 4-16 weeks
following subcutaneous inoculation of two million cells of the indicated ISC
pedigrees at passage 6 or passage 25. ‘Pool’ indicates total set of
clones derived from P0 ileum culture before pedigree generation. ‘Cancer cells’
refers to propagating cells from a case of high-grade serous ovarian cancer.
b, Left, histological section through site of injection of 1 million cells from
pedigree 3. Right, section of injection site stained with antibody (STEM121) to
human epithelial cells (brown) revealing benign cysts. Scale bar, 15 µm.
Extended Data Figure 9 | Dose- and time-dependency of TcdB pathology in
ALI-generated colonic epithelia. a, Immunofluorescence localization of
adherens junction marker E-cadherin and tight junction marker claudin 3 in
ALI-differentiated epithelia derived from transverse colon stem cells following
exposure to 100 pM TcdB for the indicated durations. n = 4 technical
replicates. Scale bar, 100 µm. b, Representative H&E images of ALI cultures at
indicated times and concentration of TcdB exposure. Scale bar, 250 µm; n = 4
technical replicates. c, Gene set enrichment analysis of whole-genome
expression data from colonic epithelia treated with 500 pM TcdB for 24 h and
control samples showing enriched KEGG pathway sets. NES, normalized
enrichment score; NOM P value, nominal P value. d, 3D plot of upregulated
genes at the indicated time points and dosages (>twofold, P < 0.05). n = 2
technical replicates. e, Heat map of upregulated genes in 500 pM TcdB samples.
The genes (237 genes) were chosen by cutoff values (>twofold, P < 0.05).
Three time points (8, 16 and 24 h) are shown. f, 3D plot of downregulated
genes at the indicated time points and dosages (>twofold, P < 0.05). n = 2
technical replicates.
Extended Data Figure 10 | Dose- and time-dependency of TcdA pathology
in ALI-generated colonic epithelia. a, Left, representative H&E images of
ALI cultures at indicated times and concentration of TcdA exposure; right,
immunofluorescence localization of adherens junction marker E-cadherin
(ECAD; green) and mucin 2 (MUC2; red) in ALI-differentiated epithelia
derived from transverse colon stem cells following incubation with 10 nM
TcdA for the indicated durations. Scale bar, 100 µm; n = 4 technical replicates.
b, 3D plot of histological scoring of representative H&E time points and
concentrations performed by a gastrointestinal pathologist according to a standard
0-3 rating for colonic epithelial integrity. c, Distribution of tight junction
marker claudin 3 (Cldn3) and adherens junction marker (Cdh17) following
treatment of ALI colonic epithelium with TcdA for the indicated times and
doses. Scale bar, 50 µm; n = 4 technical replicates. d, Histogram of permeability
of ALI colonic epithelium (Papp) to small molecules (FD4, molecular mass
4,400 Da) following exposure to the indicated doses of TcdA for the
indicated times.

Chromothripsis from DNA damage in micronuclei

Cheng-Zhong Zhang, Alexander Spektor, Hauke Cornils, Joshua M. Francis, Emily K. Jackson, Shiwei Liu,
Matthew Meyerson & David Pellman


Genome sequencing has uncovered a new mutational phenomenon in cancer and congenital disorders called 
chromothripsis. Chromothripsis is characterized by extensive genomic rearrangements and an oscillating pattern of 
DNA copy number levels, all curiously restricted to one or a few chromosomes. The mechanism for chromothripsis is 
unknown, but we previously proposed that it could occur through the physical isolation of chromosomes in aberrant 
nuclear structures called micronuclei. Here, using a combination of live cell imaging and single-cell genome sequencing, 
we demonstrate that micronucleus formation can indeed generate a spectrum of genomic rearrangements, some of 
which recapitulate all known features of chromothripsis. These events are restricted to the mis-segregated chromosome 
and occur within one cell division. We demonstrate that the mechanism for chromothripsis can involve the 
fragmentation and subsequent reassembly of a single chromatid from a micronucleus. Collectively, these experiments 
establish a new mutational process of which chromothripsis is one extreme outcome. 


Most cancer genomes are extensively altered by point mutations and 
chromosome rearrangements. Although mutations are generally 
thought to accumulate gradually, over many cell division cycles’, 
recent cancer genome sequencing provides evidence for mutational 
processes that generate multiple mutations “all-at-once” during a 
single cell cycle’. The most striking example of such an event is 
“chromothripsis”, where a unique pattern of clustered rearrange- 
ments occurs, typically involving only a single chromosome or a 
few chromosomes*”. 

Several models have been proposed to explain the rearrangements in 
chromothripsis. One proposal is that the affected chromosome is some- 
how fragmented, with random joining of some segments and loss of 
others*. This model explains the characteristic pattern of DNA copy 
number in chromothripsis—oscillation between two copy number 
states, with islands of DNA retention and heterozygosity interspersed 
with regions of DNA loss. An alternative hypothesis is that chromo- 
thripsis is generated by DNA replication errors: collapsed replication 
forks trigger cycles of microhomology-mediated break-induced replica- 
tion (MMBIR), where distal sequences are copied to the sites of replica- 
tion fork collapse by template-switching*. Evidence for the latter model 
comes from templated insertions detected at translocation junctions 
and sequence triplications*”. Both models have only indirect support 
from genomic sequencing and have not been tested experimentally”. 

We recently proposed that the physical isolation of chromosomes 
in aberrant nuclear structures called micronuclei might explain the 
localization of DNA lesions in chromothripsis''. Micronuclei are a 
common outcome of many cell division defects, including mitotic 
errors that mis-segregate intact chromosomes, and errors in DNA 
replication or repair that generate acentric chromosome frag- 
ments’*"’. We previously found that the partitioning of intact chro- 
mosomes into newly formed micronuclei leads to cytological evidence 
of DNA damage, specifically on the mis-segregated chromosome"’. 
After mitosis, chromosomes from micronuclei can be reincorporated 


into daughter nuclei'’, potentially integrating mutations from the 
micronucleus into the genome. 

Here, using an approach combining live cell imaging with single-cell 
genomic analysis that we call “Look-Seq”, we demonstrate that micronucleus
formation can generate a spectrum of complex chromosomal
rearrangements, providing the first direct experimental evidence for a 
mechanism leading to chromothripsis. 


Damage to micronuclei after S phase entry 

To determine if micronucleus formation leads to chromosome rear- 
rangements, we first sought to clarify the cell population where rear- 
rangements would most likely occur. Previously, we found that newly- 
formed micronuclei do not have marked levels of DNA damage in G1, 
but damaged micronuclei accumulate as cells progress into the S and 
G2 phases of the cell cycle’’, suggesting a link between DNA damage 
and DNA replication. Additionally or alternatively, the nuclear envel- 
opes of micronuclei are prone to irreversible “rupture” as defined by 
the abrupt loss of soluble nuclear proteins’*. Nuclear envelope rupture 
in micronuclei is strongly associated with DNA damage, but occurs at 
random, not specifically during S phase’*. 

To re-examine the timing of DNA damage, micronuclei were gen- 
erated in synchronized cells by a nocodazole release procedure’’. As 
expected''"’, no significant DNA damage was detected in ruptured 
micronuclei during G1, but damage was common during S and G2 
phases as indicated by fluorescence labelling for γ-H2AX or Gam, a
bacteriophage protein that marks double strand breaks (Extended
Data Fig. 1a, b). Moreover, micronuclei from serum-starved G0 cells
showed little detectable DNA damage, despite rupture of the micronuclear
envelope during G0 (Extended Data Fig. 1c). Therefore,
DNA damage is not triggered by nuclear envelope rupture alone, 
but also requires entry into S phase. 

Consistent with this conclusion, cell labelling with 5-ethynyl-2’- 
deoxyuridine (EdU) demonstrated that most damaged micronuclei 
had initiated DNA replication (Extended Data Fig. 1d). However,
overall EdU incorporation was markedly lower in micronuclei com- 
pared to the cell’s primary nucleus, irrespective of whether the micro- 
nuclei were ruptured or intact'’* (Fig. 1a). Thus, chromosomes in 
micronuclei are under-replicated in general, even though the majority 
of the damaged micronuclei have initiated DNA replication. These 
results focused the experiments below on micronuclei that rupture 
after S phase entry. 


Look-Seq strategy 

We designed a procedure to determine the genomic consequences of 
DNA damage in ruptured micronuclei (Fig. 1b). Non-transformed 
human RPE-1 cells were synchronized by nocodazole release, sorted 
into 384 well plates, and wells containing a single micronucleated cell 
were identified. By live cell imaging, we identified cells where the 
micronuclear envelope ruptured after the beginning of S phase 
(Methods). These experiments were performed after small interfering 
RNA (siRNA)-mediated knockdown of p53 because chromosome 
mis-segregation after nocodazole release induces a p53-dependent 
G1 cell cycle arrest’*'’. After one division of the micronucleated cell, 
we selected daughters with no detectable micronuclei, indicating the 
micronuclear chromosome was reincorporated into daughters’ prim- 
ary nuclei. Cells with reincorporation were selected because nuclear 
envelope rupture inactivates nuclear processes such as DNA replica- 
tion and transcription’, and we presume that the damaged chro- 
mosome from the micronucleus may require exposure to a normal 
nucleoplasm, with functional DNA repair pathways, to generate rear- 
rangements. Daughter cells were then separated, amplified (multi- 
strand displacement amplification, MDA), sequenced, and analysed 
independently'® (see Methods, Supplementary Table 1). 


Identifying the micronuclear chromosome 


Micronucleated cells will be of two kinds: disomic for the chro- 
mosome in the micronucleus (Fig. 2a, left schematic), if the lagging 
chromosome was segregated into the correct daughter cell (albeit 
partitioned into a micronucleus); or trisomic for the affected chro- 
mosome (Fig. 2a, right schematic), if the lagging chromosome was 
mis-segregated. The division of a disomic micronucleated cell will 
produce one near-disomic daughter with the under-replicated chro- 
mosome from the micronucleus, and one monosomic daughter, here- 
after referred to as a 2:1 asymmetric copy number pattern. Similarly, 
the division of a trisomic micronucleated cell will produce one 
near-trisomic daughter with the micronuclear chromosome, and 
one disomic daughter, or a 3:2 pattern. In either pattern, we refer 
to the daughter cell with the higher DNA copy number as the ‘plus’ 
cell and the other daughter as the ‘minus’ cell. Thus, reduced DNA 

Figure 1 | Look-Seq procedure to analyse DNA damage from the under-replicated
chromosomes in micronuclei. a, Reduced DNA replication in both
intact and ruptured micronuclei. Left, fluorescence intensity measurements 
after continuous EdU labelling following release from a nocodazole block. EdU 
intensity is normalized to nuclear area (N > 100 from two experiments for each 
category, see Methods). Red bars, mean and standard deviation. b, Look-Seq 

replication in the ruptured micronucleus creates an odd chromosome
number and a copy number asymmetry between the daughters. This 
asymmetry identifies the mis-segregated chromosome and demon- 
strates that mis-segregation is a de novo event that occurred during the 
last cell division (Fig. 2a). Because the plus daughter has the chro- 
mosome from the micronucleus, rearrangements, if observed, should 
be concentrated in the plus daughter and associated with the ‘extra’ 
chromatid from the micronucleus; we refer to the haplotype of this 
chromatid as the ‘gained’ or ‘mis-segregated’ haplotype. 
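
The copy number logic described above can be illustrated with a minimal Python sketch. It assumes that chromosome-level copy numbers have already been estimated for each daughter; the function name `classify_pair` and the tolerance of 0.4 copies are hypothetical choices made for illustration and are not taken from the Methods.

```python
def classify_pair(cn_a: float, cn_b: float, tol: float = 0.4) -> dict:
    """Classify one chromosome in a daughter pair by copy number asymmetry.

    cn_a, cn_b: estimated copy numbers in daughters 'a' and 'b'.
    tol: hypothetical minimum difference treated as a real asymmetry.
    """
    if abs(cn_a - cn_b) < tol:
        # Both daughters carry about the same amount: normal segregation,
        # or a pre-existing aneuploidy shared by both daughters.
        return {"pattern": "symmetric", "plus": None, "minus": None}
    plus, minus = ("a", "b") if cn_a > cn_b else ("b", "a")
    # The minus daughter carries only fully replicated chromosomes, so its
    # copy number should be near an integer: 1 for 2:1, 2 for 3:2.
    minus_cn = round(min(cn_a, cn_b))
    pattern = {1: "2:1", 2: "3:2"}.get(minus_cn, "other")
    return {"pattern": pattern, "plus": plus, "minus": minus}


# A near-disomic plus daughter (under-replicated extra chromatid) paired
# with a monosomic minus daughter.
print(classify_pair(1.8, 1.0))
```

The plus daughter is expected to fall somewhat short of a whole extra copy because the micronuclear chromosome is under-replicated, which is why the sketch keys the pattern on the minus daughter rather than on the plus daughter.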

We sequenced 10 control daughter cell pairs from non-micronu- 
cleated mother cells and 9 experimental pairs derived from micronu- 
cleated cells (Supplementary Videos 1-9). All controls underwent 
siRNA-mediated knockdown of p53 (C1-C4 and N1-N6), and six 
of these also were carried through the nocodazole release protocol 
(N1-N6). DNA copy-number analysis (Methods, Supplementary 
Table 2) identified the known clonal gains of chromosome (Chr.) 
10q and subclonal gains of Chr. 12 in the RPE-1 line (ref. 11 and 
Fig. 2b), validating our arm-level copy number measurements. As 
expected, sporadic chromosome gains or losses—shared between 
the daughters—were also detected. These shared aneuploidies were 
presumably present in the mother-cell primary nucleus and under- 
went normal replication and segregation. 

In contrast to the non-micronucleated controls (Fig. 2b, top and 
middle panels), the 9 daughter pairs derived from cells with ruptured 
micronuclei (MN) all contained at least one chromosome with copy 
number asymmetry, with either a 2:1 (MN1-MN6) or 3:2 (MN7-MN9)
segregation pattern (Fig. 2b, bottom panel). In the pair of
MN8 daughter cells, the 3:2 ratio for both Chr. 4 and Chr. 11 
(Fig. 2b) suggests that these chromosomes were both in the micro- 
nucleus observed by imaging. In the MN7 daughters, we observed a 
3:2 ratio only for the q arm of Chr. 1 (Fig. 2b; also see Extended Data 
Fig. 3). Most likely, an acentric Chr. 1q fragment was generated by the 
cleavage of a chromosome bridge from the previous mitosis’’, and 
partitioned into the micronucleus in the MN7 mother. 

We developed a method to determine loss-of-heterozygosity 
(LOH) in single-cell genomes (Methods) that is insensitive to the 
amplification bias inherent to MDA”. This analysis confirmed genuine 
monosomy of chromosomes in the minus daughters of 2:1 mis-segre- 
gations (Extended Data Fig. 2a—c). From the sequencing of these hemi- 
zygous chromosomes we determined the haplotype phase (genotypes 
at polymorphic sites for each homologue) and devised a method to 
measure the copy number for each homologue (Extended Data Fig. 3a, 
Methods). This haplotype copy number information enabled us to 
identify amplification bias affecting both homologues equally 
(Extended Data Fig. 3b) and distinguish it from true copy number 
alterations affecting one homologue (Extended Data Fig. 3c); this 

strategy. Nuclei are labelled with green fluorescent protein-conjugated histone
H2B (GFP-H2B). Disruption of the micronuclear envelope is visualized
by the loss of a nuclear localized red fluorescent protein (RFP-NLS);
reincorporation of the micronucleus is inferred from the absence of micronuclei
in either G1 daughter.

Figure 2 | Identification of the mis-segregated chromosome by DNA copy
number analysis. a, Predicted DNA copy number outcomes for daughter 
cells derived from a micronucleated mother cell. Left, lagging chromosome 
(magenta arrow) is correctly segregated, but partitioned into a micronucleus 
(MN); right, lagging chromosome is mis-segregated into a micronucleus. The 
chromosome in the micronucleus is under-replicated and asymmetrically 
segregated, resulting in either a 2:1 (left) or 3:2 (right) copy number ratio in 
daughter cells. Homologue in the micronucleus, light green; homologue in the 
primary nucleus (PN), blue; red outline indicates the under-replicated 
chromosome from the micronucleus. Hatched box, ‘plus’ daughter; solid box, 
‘minus’ daughter. b, Heat map of arm-level DNA copy number (log2 ratio)
in daughter cells derived from micronucleated mothers (MN1-MN9) or 


analysis further confirmed the predicted 3:2 mis-segregation pattern 
(Fig. 2a, right schematic). 
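
The distinction drawn above between amplification bias and genuine copy number change can be sketched as follows. This is an illustration only: it assumes that read coverage has already been assigned to each homologue at phased heterozygous sites and binned, and the function name `classify_bin`, the per-copy `expected` coverage and the tolerance of 0.35 are hypothetical and are not the procedure given in the Methods.

```python
def classify_bin(cov_a: float, cov_b: float, expected: float,
                 tol: float = 0.35) -> str:
    """Label one genomic bin from haplotype-resolved coverage.

    cov_a, cov_b: coverage assigned to homologue A and homologue B.
    expected: coverage expected for a single copy (assumed known).
    tol: hypothetical tolerance on the normalized coverage ratio.
    """
    if expected <= 0:
        raise ValueError("expected coverage must be positive")
    ratio_a = cov_a / expected
    ratio_b = cov_b / expected
    if abs(ratio_a - ratio_b) > tol:
        # Only one homologue deviates: candidate true copy number change.
        return "haplotype-specific"
    if abs(ratio_a - 1.0) <= tol and abs(ratio_b - 1.0) <= tol:
        return "normal"
    # Both homologues deviate together: consistent with amplification bias.
    return "shared bias"


print(classify_bin(30, 60, expected=30))  # one homologue doubled
```

A change that truly affects both homologues equally would also be labelled as shared bias by this simple rule, so a comparison between the two daughters would be needed to separate the two cases.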


Localized chromosome rearrangements 

We next tested the prediction that there would be a concentration of 
rearrangements on the mis-segregated chromosome. De novo rearran- 
gements were detected by clustering of discordant read pairs’* 
(Methods, Supplementary Table 3). Rearrangements on the normally 
segregated control chromosomes were uniformly distributed (Extended 
Data Fig. 4a), as would be expected for background errors from MDA- 
based whole-genome amplification. By contrast, there was a significant 
enrichment of rearrangements (median: 12.5-fold) on the mis-segre- 
gated chromosomes identified by copy number asymmetry relative to 
the normally segregated control chromosomes (Extended Data Fig. 4b, 
top panel). Remarkably, the concentration of rearrangements on the 
mis-segregated chromosomes was observed in 8 of 9 post-micronuclear 
cell pairs (P< 10 “*, Bonferroni corrected one-sided Poisson test). 
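
As an illustration of the kind of test named above, the following sketch computes a Bonferroni-corrected one-sided Poisson P value for the number of breakpoints on one chromosome. It is a simplified stand-in and not the authors' implementation: the background rate per Mb, the chromosome length and the number of tests are assumed inputs, and the normalization for copy number and detection sensitivity described in the paper is omitted. The names `poisson_sf` and `enrichment_pvalue` are hypothetical.

```python
import math


def poisson_sf(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    # Precision is limited to roughly 1e-16 by the subtraction from 1.
    return max(0.0, 1.0 - cdf)


def enrichment_pvalue(observed: int, chrom_len_mb: float,
                      background_rate_per_mb: float, n_tests: int) -> float:
    """One-sided Poisson P value with Bonferroni correction."""
    lam = background_rate_per_mb * chrom_len_mb  # expected breakpoint count
    return min(1.0, poisson_sf(observed, lam) * n_tests)


# Hypothetical numbers: 20 breakpoints on a 100 Mb chromosome when the
# background predicts 2, corrected for 9 tested cell pairs.
print(enrichment_pvalue(20, 100.0, 0.02, 9))
```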

As short-range inversions are amplification errors reported to 
occur frequently during MDA, we analysed inverted-type and non- 
inverted type rearrangements separately (Extended Data Fig. 5a-c, 
Methods). Power-law scaling analysis revealed that inversions were 
enriched at breakpoint distances <150 kilobase (kb) (Extended Data 
Fig. 5a, b) and were randomly distributed (Extended Data Fig. 4b), as 
expected for MDA errors (Methods). Long-range rearrangements 
(>150 kb or interchromosomal) on the normally segregated 


non-micronucleated controls (C1-C4, N1-N6, see Supplementary Table 2). 
Mother cells are listed on the left with daughters, ‘a’ and ‘b’. For micronucleated 
mothers, ‘a’ (in red) represents the plus daughter and ‘b’ (in blue) represents 
the minus daughter. Chromosomes are shown on the x axis; dotted lines 
separate p and q arms. Pre-existing aneuploidies in mother cell, dashed boxes; 
de novo mis-segregations with copy number asymmetry, solid boxes. Top 
panel, bulk RPE-1 line; top middle panel, daughters from non-micronucleated 
mothers; middle panel, non-micronucleated controls after nocodazole release; 
bottom panel, daughter cells from micronucleated mothers. For N6, one 
daughter cell divided once, producing three cells. For MN9, extra time was 
provided for additional cell divisions: the plus daughter did not divide whereas 
the minus daughter divided twice, generating five cells in total. 


chromosome were also randomly distributed (Supplementary 
Tables 4 and 5). By contrast, the enrichment of long-range rearrange- 
ments specifically on the mis-segregated chromosomes was even 
more significant after elimination of short-range rearrangements 
(Fig. 3a, Extended Data Fig. 4b). 
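
The long-range criterion used above (breakpoints more than 150 kb apart, or on different chromosomes) can be written as a small filter. The `Breakpoint` type and the function name `is_long_range` are hypothetical and serve only to make the rule concrete.

```python
from typing import NamedTuple


class Breakpoint(NamedTuple):
    chrom: str  # chromosome name, e.g. "chr3"
    pos: int    # position in base pairs


def is_long_range(a: Breakpoint, b: Breakpoint,
                  min_dist: int = 150_000) -> bool:
    """True if a rearrangement joins different chromosomes or spans >150 kb."""
    if a.chrom != b.chrom:
        return True
    return abs(a.pos - b.pos) > min_dist


junctions = [
    (Breakpoint("chr3", 1_000_000), Breakpoint("chr3", 1_100_000)),  # 100 kb
    (Breakpoint("chr3", 1_000_000), Breakpoint("chr3", 1_200_000)),  # 200 kb
    (Breakpoint("chr4", 5_000_000), Breakpoint("chr11", 5_000_000)),
]
long_range = [j for j in junctions if is_long_range(*j)]
print(len(long_range))
```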

As predicted, the rearrangements in the mis-segregated chro- 
mosome occurred predominantly in the plus daughter cell (Fig. 3b, 
Extended Data Fig. 4c), with a few informative exceptions to be dis- 
cussed below. PCR amplification across rearrangement junctions 
from whole-genome-amplified DNA (Extended Data Fig. 6a) con- 
firmed the rearrangements, but could not exclude amplification 
errors. However, by sequencing rearrangement junctions with nearby 
heterozygous sites (Extended Data Fig. 6b-d, Methods), we found that 
every rearrangement tested was associated with the gained haplotype 
(Extended Data Table 1, Supplementary Table 5), indicating that 
the rearrangements occurred on the mis-segregated chromatid. 
Interestingly, we sometimes detected an unaltered product in addition 
to the rearranged product with the mis-segregated haplotype. We 
hypothesize that these two products may be generated by breakage 
of a partially replicated sequence near a replication fork with only one 
side of the break participating in the rearrangement (Extended Data 
Fig. 6e). Thus, there is a marked concentration of long-range rearran- 
gements associated with the gained haplotype in the plus daughter 
cell, indicating that these rearrangements originate from the breakage 
of the chromatid in the ruptured micronucleus. 

The translocation junctions for long-range rearrangements had
several notable features. Unlike control chromosomes, microhomol- 
ogy was observed at >50% of the junctions from the mis-segregated 
chromosomes, a higher frequency than expected by chance”! 
(Extended Data Fig. 7a). Microhomology could originate from 
alternative non-homologous end joining or MMBIR*. In the MN6 
plus daughter, 14 out of 16 breakpoints formed an uninterrupted 
chain (Extended Data Fig. 7b). These chained translocations resemble 


Figure 3 | Enrichment of long-range 
rearrangements on the mis-segregated 
chromosome in the predicted daughter cell. 

a, Left, frequency of long-range rearrangements 
detected in the mis-segregated chromosomes (both 
plus and minus cell, red bars) compared with the 
remaining chromosomes (grey bars): number of 
breakpoints per Mb is normalized for DNA copy 
number and the detection sensitivity. Right, P 
values for enrichment derived from a one-sided 
Poisson test (Methods). b, CIRCOS plots showing 
DNA copy number (grey histograms) and long- 
range intrachromosomal rearrangements (green 
links) for four daughter pairs from micronucleated 
mothers. Outside ring, chromosome banding 
pattern. Grey histograms, DNA copy number 

(5 Mb bins) for the minus cell (outer ring) and the 
plus cell (inner). (For MN9, one ‘grandchild’ of 
the minus daughter is shown, see Fig. 2). Red dots, 
bins with significant gains (>1.35 × mean); blue
dots, bins with significant loss (<0.6 × mean). Loss
of 19p and gain of Xq are seen in all single cell 
samples but not in the bulk and are attributed to 
systematic amplification bias (Extended Data 

Fig. 3b). Light blue cones, chromosomes with 2:1 
mis-segregations; orange cones, 3:2 mis- 
segregations. 


examples of germline chromothripsis” that are presumed to preserve 
chromosome copy number because of the selection for viability*. At 
some translocation junctions, we also identified short insertions (50- 
500 base pairs, bp) originating from other widely dispersed sites on 
the mis-segregated chromosome (Extended Data Fig. 7c). In one 
example (MN9), 8 short segments from all over Chr. 8 were inserted 
into a single junction, also on Chr. 8. Such short ‘templated’ insertions 
are a well-described characteristic of MMBIR’. 


Figure 4 | Two-state oscillating copy-number 
patterns characteristic of chromothripsis. 

a, CIRCOS plots for three daughter cell pairs where 
both daughters received fragments of the mis- 
segregated chromatid. The MN8 minus daughter 
contains two Chr. 4 fragments missing from the 
plus daughter; the MN2 minus daughter contains 
one segment of Chr. 2q missing from the plus 
daughter; for the MN4 pair, dozens of Chr. 3 
fragments are reciprocally distributed between the 
daughters. b, Reciprocal distribution of fragments 
of the mis-segregated chromatid results in two- 
state copy number oscillations (retention (Ret.) or 
deletion (Del.)) of the mis-segregated haplotype in 

In the MN8 daughters, both Chrs. 4 and 11 were mis-segregated
with a 3:2 ratio (Fig. 2b). Interestingly, in the plus cell, we not only 
detected intrachromosomal rearrangements in Chrs. 4 and 11, but 
also 8 translocations between these chromosomes (Fig. 4a, left panel). 
Thus, the MN8 mother probably had copies of Chrs. 4 and 11 within 
the single micronucleus detected by imaging, generating transloca- 
tions both within and between chromosomes. 


Evidence for chromosome fragmentation 


One distinguishing feature of chromothripsis is the oscillation of 
DNA copy number between only two states’. This feature of chromo- 
thripsis was identified on the mis-segregated chromosome in three 
cell pairs, MN2, MN4 and MN8 (Fig. 4a, Extended Data Fig. 8a-c), 
most strikingly in the MN4 daughters. For this pair, one intact Chr. 3 
haplotype was distributed evenly to both daughters (Extended Data 
Fig. 8b), whereas the other haplotype displayed a pattern of altern- 
ating retention and loss (Fig. 4b). Combining the copy number for 
both haplotypes yields an oscillating pattern: hemizygous regions 
with one copy alternate with heterozygous regions with two copies. 
Remarkably, segments of the fragmented haplotype that were gained 
in one daughter were almost always lost from the other and vice versa. 
Of all fragments of the mis-segregated Chr. 3 haplotype detected in 
either daughter, 97% were mutually exclusively distributed between 
the two daughters, with only 3% shared by both. Moreover, long- 
range rearrangements in both daughter cells were almost completely 
(>90%) restricted to regions where the fragmented haplotype was 
retained (P ~ 10 * for the plus daughter and 10 '* for the minus 
daughter, binomial test) and directly associated with the gained hap- 
lotype when adjacent polymorphisms were present (Fig. 4b). 
Therefore, the division of the MN4 mother cell generated the canon- 
ical features of chromothripsis in both daughters. A single chromatid 
from a micronucleus was fragmented and randomly distributed 
between daughter cells, followed by the joining of fragments in ran- 
dom order and orientation. Thus, the majority of DNA segments that 
are ‘lost’ in one cell are, in fact, distributed to the other. 
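
The reciprocal distribution described above can be made concrete with a short sketch. It assumes that each fragment of the mis-segregated haplotype has already been called as retained or deleted in each daughter; the function names and the example calls are hypothetical and do not reproduce the MN4 data.

```python
def fragment_distribution(plus_retained: list[bool],
                          minus_retained: list[bool]) -> tuple[float, float]:
    """Return (fraction exclusive to one daughter, fraction shared by both).

    Only fragments detected in at least one daughter are counted.
    """
    if len(plus_retained) != len(minus_retained):
        raise ValueError("both daughters must cover the same fragments")
    detected = [(p, m) for p, m in zip(plus_retained, minus_retained) if p or m]
    if not detected:
        raise ValueError("no fragment detected in either daughter")
    exclusive = sum(1 for p, m in detected if p != m)
    n = len(detected)
    return exclusive / n, (n - exclusive) / n


def total_copy_number(retained: list[bool]) -> list[int]:
    """Two-state copy number: one intact homologue plus 0 or 1 fragment copy."""
    return [1 + int(r) for r in retained]


plus = [True, False, True, False, True, True]
minus = [False, True, False, True, False, True]
print(fragment_distribution(plus, minus))
print(total_copy_number(plus))
```

In this toy example the copy number of each daughter alternates between one and two copies, and most fragments absent from one daughter are present in the other, which is the pattern reported for the MN4 pair.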


Potential nascent double minutes 


One way that chromothripsis may promote tumour development is 
by generating double minute chromosomes*”, small circular acentric 
chromosomes that can be present at very high copy number and carry 
oncogenes’. Intriguingly, in the MN4 daughters, we detected four 
1-3 megabase (Mb) circular chromosomes, which may represent 
examples of the initial step in generating double minutes” (Fig. 5). 
Evidence that these are true circular chromosome fragments not only 
comes from sequencing reads spanning the junctions, but also 
because the junctions fall at boundaries where the mis-segregated 
haplotype is deleted. These deletions confirm that the junctions occur 
at genuine break sites and also exclude the possibility that the junc- 
tions result from tandem duplications. 


Discussion 


The experiments described here define a new mutational process that 
provides one mechanistic explanation for chromothripsis, a unique 
pattern of localized chromosome rearrangements observed in cancer 
and congenital diseases. By recapitulating chromothripsis in the 
laboratory, we establish that it can occur after an intact chromosome 
is partitioned into a micronucleus. These findings highlight the crit- 
ical importance of nuclear architecture and nuclear envelope integrity 
for the maintenance of genome stability in eukaryotic cells”. 

After the division of micronucleated cells, we observed extensive 
localized chromosome rearrangements, some of which bear all the hall- 
marks of chromothripsis. The following evidence indicates that these 
rearrangements occurred on the chromosome from the micronucleus. 
First, chromosomes in micronuclei are under-replicated and accu- 
mulate marked evidence of DNA damage’’*. Under-replication of 
the chromosome in the micronucleus creates a copy number asymmetry 

Figure 5 | Circular chromosomal structures resulting from chromothripsis.
a-d, Shown are four circular structures formed by single segments (a and b) or 
by multiple separate segments (c and d); a and c are from the MN4 plus 
daughter; b and d are from the MN4 minus daughter. In each panel, 
circularized fragments are shown with the reference coordinates on the left 
(S1 and S2 denote separate segments), with 5’ breaks (black triangles) and 3’ 
breaks (red triangles) linked by rearrangements indicated by dashed lines; 
resulting circular structures are illustrated on the right (not to scale). 


between the daughters, identifying the mis-segregated chromosome and 
establishing mis-segregation as a de novo event. Second, there is a highly 
significant enrichment of chromosome rearrangements only on the 
mis-segregated chromosome after the division of micronucleated cells; 
it is never observed on normally segregated chromosomes after division 
of either micronucleated or control cells. Third, the rearrangements are 
associated with the chromatid in the micronucleus identified from the 
gained haplotype. Thus, the partitioning of chromosomes into micro- 
nuclei is an ‘all-at-once’ catastrophe that can trigger extensive mutagen- 
esis at a surprisingly high frequency. 

Our results show that partitioning chromosomes into micronuclei 
can have diverse consequences for genome structure. In addition to 
intrachromosomal rearrangements, we find that the partitioning of 
more than one chromosome into one micronucleus can generate trans- 
locations between chromosomes. We speculate that the reassembly of 
fragments from one or more chromosomes in a micronucleus could 
generate ring chromosomes, which are observed in a number of human 
cancers, and whose formation has recently been suggested to involve 
chromothripsis**. We also observed circularized fragments originating 
from the mis-segregated chromosome. Circularization of shattered 
chromosome fragments provides an appealingly simple mechanism 
for the first step in generating double minute chromosomes™, which 
are frequent conduits for oncogene amplification in tumours**”* and 
were previously linked to chromothripsis’. 

Our experiments provide insight into the mechanism of chromo- 
thripsis. Elegant statistical analysis led to the proposal that chromo- 
thripsis could involve the “shattering” and reassembly of a 
chromosome, with “loss” of some segments*. However, the molecular 
mechanism or even the feasibility of such an event was not clear. Our 
analysis directly establishes that a chromatid can indeed be fragmen- 
ted, with fragments distributed between daughter cells. In addition to 
validating the shattering and reassembly mechanism, our findings 
also explain the segmental DNA loss that characterizes many exam- 
ples of chromothripsis: loss may simply occur by partitioning of chro- 
mosome segments into a daughter cell that does not expand and does 
not contribute to the final population. Chromothripsis has also been 
suggested to originate from DNA replication errors that generate 
MMBIR’. MMBIR could be an independent mechanism causing 
chromothripsis, or an additional contributing factor. In agreement 
with this latter possibility, we detect short, potentially “templated” 
insertions at a minority of translocation junctions that are consistent 
with co-occurring MMBIR”. 

The data here establish that the rupture of micronuclei during S
phase is one important source of mutagenesis. However, as we prev- 
iously proposed”, there may be more than one defect in micronuclei 
that generates DNA damage. Intact micronuclei exhibit reduced or 
delayed DNA replication and also fail to normally accumulate several 
DNA replication and repair proteins (ref. 11 and A. Spektor, 
E. Jackson & D. Pellman, unpublished data). If the short insertions 
we observe at translocation junctions result from replication defects 
and MMBIR*, given that nuclear envelope rupture seems to terminate 
most nuclear activity’*, MMBIR probably occurred in the intact 
micronucleus before rupture, or after reincorporation of the micro- 
nucleus into a daughter cell nucleus. Because micronuclei replicate 
asynchronously with the primary nucleus, many cells enter mitosis 
with micronuclei that are still undergoing DNA replication’’”*. This 
results in premature chromosome compaction, which is proposed to 
cause DNA breakage, best documented at chromosome fragile sites”. 

Our findings define a new mutagenesis pathway that generates a 
spectrum of localized chromosomal rearrangements, some of which 
have all the features of chromothripsis. Consistent with other recent 
work'**°*!, the results here show that mitotic chromosome segrega- 
tion errors can be heavily mutagenic, which has important implica- 
tions for how mitotic errors and the accompanying aneuploidy might 
contribute to cancer or other human diseases. Mitotic chromosome 
segregation errors occur frequently, resulting in 1-5% aneuploidy in 
normal tissues in mice’. Chromothripsis is reported in a few percent 
of human cancers**** and in rare human congenital disorders®. 
However, the actual rate of chromothripsis is likely to be much higher 
because most events are expected to compromise cellular fitness, 
and these events would only be detected by single-cell analysis”. 
Furthermore, we find that DNA damage from micronuclei can lead 
to a moderate degree of rearrangement that might not, ex post facto, be 
recognized as related to chromothripsis. Micronuclei may therefore be 
an important, but previously unappreciated, source of genetic variation. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

METHODS


No statistical methods were used to predetermine sample size. 

Cell culture and treatment. U2OS, telomerase-immortalized RPE-1 cells 
(ATCC), and all derivative cell lines generated for this study were tested for 
Mycoplasma, and were grown in phenol red-free DMEM:F12 media containing 
10% FBS, 100 IU ml⁻¹ penicillin, and 100 μg ml⁻¹ streptomycin. All cells were
maintained at 37°C with 5% CO2 atmosphere. For cell cycle synchronization and
induction of MN, cells were treated with 100 ng ml⁻¹ nocodazole (Sigma) for 6 h.
Mitotic cells were collected and washed three times with fresh medium containing 
10% FBS before plating. To arrest cells in G0, nocodazole-treated cells were 
washed 3 times with media containing 0.01% FBS following mitotic shake-off 
and then re-plated in media containing 0.01% FBS. After 3 h, cells were placed in a
serum-free DMEM:F12 media. For EdU incorporation experiments, cells were 
incubated in the presence of 10 mM EdU from the time of mitotic shake-off. 
Generation of cell lines. Lentivirus or retrovirus carrying genes of interest were 
generated by transfection of 293FT cells with the appropriate packaging plasmids 
(Lentivirus, pMD2.G and psPAX2; Retrovirus, pUMVC and pVSV-G) using 
Lipofectamine 2000 (Life Technologies), according to the manufacturer’s instructions.
RPE-1 cells were infected for 16-24 h with virus in the presence of 10 μg ml⁻¹
polybrene, washed, and allowed to recover for 24 h before selection with an
antibiotic or by fluorescence cell sorting. 

RNA interference. The small interfering RNA (siRNA) pool used, from Dharmacon,
was Human TP53 ON-TARGETplus SMARTpool siRNA L-003329-00-0005, with the
following sequences: (J-003329-14) GAAAUUUGCGUGUGGAGUA, (J-003329-15)
GUGCAGCUGUGGGUUGAUU, (J-003329-16) GCAGUCAGAUCCUAGCGUG,
(J-003329-17) GGAGAAUAUUUCACCCUUC. Cells were transfected with 40 nM siRNA
using Lipofectamine 3000 transfection reagent (Life Technologies) per
manufacturer’s instructions.

DNA constructs. Plasmid encoding cDNA for H2B-GFP was obtained from 
Addgene (Plasmid 11680). Constructs encoding TDRFP-NLS and NLS-eGFP 
were a gift from A. Salic. Emerald GFP-Gam was a gift of S. Rosenberg. 
Reagents and antibodies. DNA damage was detected using phospho-histone
H2A.X (Ser139) antibody for γ-H2AX (1:300-500, Cell Signaling Catalog
Number 2577S). Nocodazole was purchased from Sigma-Aldrich. Secondary 
antibodies used were Alexa Fluor 488 (green), 594 (red) and 647 (far red) 
(1:1,000, Life Technologies). 

Detection of EdU incorporation. Detection of EdU incorporation was per- 
formed using Click-iT Plus EdU Alexa Fluor imaging kits 594 and 647 (Life 
Technologies) per manufacturer’s instructions. 

Live cell imaging, single-cell isolation and daughter cell separation. RPE-1 cells 
expressing H2B-GFP and RFP-NLS were treated as described above to induce
micronuclei after depletion of p53 by siRNA. After mitotic shake-off, cells were 
re-plated and allowed to progress into G1 phase for 4 h. Afterwards cells were 
trypsinized and single-cell sorted into 384-well μClear plates (Greiner) using
FACS. Following single cell sorting, cells were incubated for 2 h to allow for cell 
attachment and spreading. Plates were mounted on a Nikon TE2000-E2 inverted 
microscope equipped with the Nikon Perfect Focus system. The microscope was 
enclosed within a temperature- and CO2-controlled environment that maintained
an atmosphere of 37°C and 3-5% humidified CO2. Wells containing single
cells of interest were identified manually and fluorescence and differential inter- 
ference contrast images were captured every 30 min with a 20X NA 0.5 Plan Fluor 
objective for up to 48 h or until the majority of cells had progressed through 
mitosis. All captured images were analysed using NIS-Elements software. 

Wells containing cells of interest, having completed mitosis, were washed with
PBS and cells were subsequently trypsinized. After addition of an excess of fresh
medium, daughter cells were separated by limited dilution into new wells in a
fresh 384-well μClear plate. Successful separation and transfer into new wells was
monitored using a fluorescence microscope. In cases where both daughters ended 
up in the same well, separation by limited dilution was repeated. After separation, 
the cells were left to attach for up to 4 h before cell lysis. 
Indirect immunofluorescence. Cells were washed in PBS and fixed in 4% para- 
formaldehyde for 20 min; cells were then extracted in PBS-0.5% Triton X-100 for 
5 min, washed 3 times with PBS, blocked for 30 min in PBS containing 3% BSA 
(PBS-BSA) and incubated with primary antibodies diluted in PBS-BSA for 60 
min. Samples were washed 3 times for 5 min with PBS-0.05% Triton X-100 and 
primary antibodies were detected using species-specific fluorescent secondary 
antibodies (Life Technologies). Samples were washed 3 more times for 5 min 
with PBS-0.05% Triton X-100 before DNA detection (2.5 μg ml⁻¹ Hoechst). For
pre-extraction, cells were washed once with PBS, and then incubated in CSK 
buffer (100 nM NaCl, 300 mM sucrose, 3 mM MgCl2 and 10 mM PIPES pH
6.8) containing 0.5% Triton X-100 for 5 min on ice. Cells were then washed 3 
times with PBS, fixed and processed as above. 
Image acquisition and analysis. Immunofluorescence images were collected
with a Yokogawa CSU-22 spinning disk confocal system with Borealis modi- 
fication, which was attached to a Nikon Ti-E inverted microscope (Nikon 
Instruments, Melville, NY). Laser excitation of the fluorophores was performed 
sequentially using the 405 nm, 488 nm, 561 nm and 642 nm lasers.
Images were acquired using a 60X Plan Apo NA 1.4 oil objective with a 
CoolSnapHQ2 CCD camera (Photometrics). Acquisition parameters, shutters, 
filter positions and focus were controlled by Metamorph software (Molecular 
Devices), which was also used for image analysis. Regions of interest were defined 
for micronuclei and corresponding primary nuclei, and average fluorescence 
intensities were determined. Background values were subtracted from a region 
the same size and shape as the micronucleus, set equidistant from the 
primary nucleus. 

Thresholds were set to exclude cells that were not synchronized after mitotic 
shake-off. In G1 phase samples, cells were excluded if the primary nucleus con- 
tained visible EdU. In S or G2 phase samples, cells were excluded if there were low 
levels of replication in the primary nucleus, operationally defined by an average 
fluorescence intensity threshold. Additional average fluorescence intensity thresholds
for γ-H2AX were used to exclude rare cells from all samples (<5%) where the
primary nucleus had significant DNA damage, and to define γ-H2AX positive
micronuclei. For EdU detection, micronuclei were scored as EdU positive if their
EdU signal was >3 standard deviations above the mean background.
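
The EdU scoring rule above can be expressed as a short sketch. It assumes a set of background intensity measurements for each image; the function name `is_edu_positive` and the use of the sample standard deviation are assumptions, since the text does not specify how the background mean and standard deviation were computed.

```python
import statistics


def is_edu_positive(signal: float, background: list[float],
                    n_sd: float = 3.0) -> bool:
    """Score a micronucleus as EdU positive if its signal exceeds the
    background mean by more than n_sd standard deviations."""
    mean = statistics.fmean(background)
    sd = statistics.stdev(background)  # sample SD (assumption)
    return signal > mean + n_sd * sd


# Hypothetical background intensities in arbitrary units.
background = [10.0, 12.0, 11.0, 9.0, 13.0]
print(is_edu_positive(16.0, background), is_edu_positive(15.0, background))
```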

Each immunofluorescence experiment included two biological replicates.
The numbers of cells counted (N) for each experiment were as follows (biological
replicate 1, biological replicate 2, total): Fig. 1a: primary nuclei (83, 83, 166),
intact micronuclei (53, 50, 103), ruptured micronuclei (51, 51, 102); Extended
Data Fig. 1a: G1 (63, 69, 132), S (55, 50, 105), G2 (62, 63, 125); Extended Data Fig.
1b: γ-H2AX (−) micronuclei (87, 63, 150), γ-H2AX (+) micronuclei (109, 73,
182); Extended Data Fig. 1c: G0 (63, 63, 126), S (53, 50, 103); Extended Data Fig.
1d: γ-H2AX (+) micronuclei (62, 49, 111).
Multi-strand displacement amplification and library construction of single- 
cell genomic DNA. We chose to amplify single-cell genomes by multi-strand 
displacement amplification (MDA) with the Phi-29 polymerase for four main
reasons. First, MDA gives better overall genome coverage than PCR-based
methods and also gives comparable uniformity to other methods such as
MALBAC; this is required for the detection of chromosomal rearrangements.
Second, Phi-29 polymerase has the highest processivity and the lowest error rate
among existing polymerases. Third, amplification bias due to MDA has been
characterized as largely random, even between the two homologues of the same
chromosome; this enables us to estimate the coverage for each homologue and the
detection sensitivity for de novo variants, and to accurately calculate the copy
number for each homologue from the coverage at heterozygous sites. Finally, the
high processivity of Phi-29 polymerase consistently generates large amplicons
above 10 kb; this enables us to perform Sanger sequencing on the MDA
product after PCR to generate phasing information of rearrangements and val- 
idate their association with the mis-segregated chromosome, which is crucial in 
establishing the relationship between chromosomal rearrangements and DNA 
damage in the micronuclei. 

DNA from isolated cells was subject to MDA following lysis using the REPLI-g 
Single Cell Kit (Qiagen) with minor modification. (Note that we achieved 
the best overall coverage uniformity with this latest version of REPLI-g from 
Qiagen as compared to earlier versions of REPLI-g or the RepliPhi enzyme 
from Epicentre. Comparison of the coverage and uniformity of the single-cell 
libraries in the current study with previous studies is summarized in
Supplementary Table 1 and in ref. 20.) Samples were washed once with PBS
and cells were lysed using 10 µl of a 1:1 mixture of the provided lysis buffer
and PBS by a brief vortex and spin down, followed by a 10 min incubation at
50°C and then an additional 10 min at room temperature. Lysis was stopped by
adding 5 µl of stop solution with vortex and spin down, followed by incubating at
room temperature for another 10 min. Whole-genome amplification by MDA
was carried out in a total of 50 µl for 80 min at 30°C. Purified genomic DNA (for
bulk RPE-1) or amplified DNA (from single cells) was sheared to 300–500 base
pair size and used for multiplex genome sequencing libraries as previously
described.

Library quality was assessed by low-pass sequencing (~0.1×) on a MiSeq
instrument (Illumina). DNA libraries that passed MiSeq quality control were then
sequenced to ~5× per cell on the HiSeq platform (Illumina). Two samples, one
from the control group (N3) and one from the MN group (MN4), were subject to
additional sequencing to a total depth of ~9× per cell for validation of the
enrichment of chromosomal rearrangements on the mis-segregated chromosomes.
For the MN9 daughters the plus cell did not divide but the minus cell
divided twice. The four progeny cells from the MN9 minus daughter were
sequenced to ~6× combined coverage and the plus cell that did not divide
was sequenced to ~4× depth. (The main goal of this experiment had been to
obtain biological replicates of cells with potentially chromothriptic damage;
however, in this example the plus daughter with rearranged Chr. 8 did not divide.) For
the three N6 progeny, one cell divided and the other did not before library
preparation; all three progeny were sequenced to ~4× depth.

Processing of single-cell sequencing data. Sequencing reads were aligned to the 
human genome reference (hg19/GRCh37) using bwa (http://bio-bwa.sourceforge.net/)
in the paired-end mode by “bwa mem”. For both primary and supplementary
alignments, duplicated sequencing fragments were removed by
MarkDuplicates from the PICARD software suite (http://picard.sourceforge.net/).
We also generated genotype information by running the UnifiedGenotyper module
from GATK (https://www.broadinstitute.org/gatk/) at HapMap SNP sites
(ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/current/b37/hapmap_3.3.b37.vcf)
across all sequenced samples (bulk, controls, MN samples). We identified
884,944 heterozygous sites in the bulk sample sequencing with quality scores
≥ 100. Genotypes at these heterozygous sites were later used to calculate haplotype
coverage and copy number. 

DNA copy number analysis from sequencing read depth. The average read 
depth was calculated by dividing the number of properly aligned read pairs (insert 
size <1 kb and having the ‘forward-reverse’ pair orientation) by the total 
ungapped length of each bin (for binned coverage) or by the ungapped length 
of the relevant chromosomal arm (for arm-level coverage). The read depth was 
then normalized by the median value of all bins (or arms) for each sample to 
calculate the average DNA copy number of each bin. 
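
A minimal sketch of this read-depth calculation follows. The function names and the "FR" orientation label are hypothetical, and the output is the depth relative to the sample median; how that ratio is scaled to an absolute copy number is not specified here and is left out.

```python
from statistics import median

MAX_INSERT_BP = 1_000  # "insert size <1 kb" in the text


def is_properly_aligned(insert_size, orientation):
    """Keep read pairs with insert size <1 kb and forward-reverse orientation.

    The orientation label "FR" is a hypothetical encoding for this sketch.
    """
    return 0 < insert_size < MAX_INSERT_BP and orientation == "FR"


def relative_copy_number(pair_counts, ungapped_lengths):
    """Read depth per bin (pairs per ungapped base) divided by the sample median."""
    depths = [n / length for n, length in zip(pair_counts, ungapped_lengths)]
    sample_median = median(depths)
    return [d / sample_median for d in depths]


# Four hypothetical bins; the last has half the ungapped length of the others.
print(relative_copy_number([100, 200, 100, 50], [1000, 1000, 1000, 500]))
```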

We corrected for GC-content amplification bias in each library by modifying 
the strategy described in ref. 44. The overall strategy was to statistically infer read 
depth variation due to GC content and normalize regional coverage by a GC- 
content-dependent correction factor. Importantly, we expected the dominant 
GC-dependent variation to have been generated during MDA before the library 
preparation; therefore, the average GC-content was calculated for 25 kb bins 
because the average amplicon size of MDA was previously estimated to be in 
the range of 10–50 kb.

To generate a reference for the relationship between read depth and GC con- 
tent, we analysed the sequence coverage of a chromosome (Chr. 6) that was 
present in all samples at two copies for each library. For each sequencing 
fragment (read pair) we calculated the average GC-content in the 25 kb bin 
centred at the leftmost alignment position of the read pair. We then grouped all 
read pairs into GC strata that differed by 1% GC content and calculated the 
average read depth for each stratum. A normalizing weight was then derived for 
each GC stratum to bring the median coverage within each GC stratum to the 
same value. Each sequencing fragment was then assigned a weight based on the 
GC composition in its 25 kb neighbourhood. The GC corrected sequencing 
coverage was generated by random sampling based on the ‘GC-correcting’ 
weight for each read. 
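
The GC-stratum weighting and weight-based sampling can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the function names, the choice of the median of stratum medians as the common target, and the default weight for unseen strata are assumptions.

```python
import random
from collections import defaultdict
from statistics import median


def stratum_weights(gc_and_depth):
    """Derive one weight per 1% GC stratum from (gc_fraction, depth) pairs
    measured on the disomic reference chromosome.

    Assumption: the common target is the median of the stratum medians.
    """
    by_stratum = defaultdict(list)
    for gc, depth in gc_and_depth:
        by_stratum[round(gc * 100)].append(depth)
    stratum_median = {s: median(d) for s, d in by_stratum.items()}
    target = median(stratum_median.values())
    return {s: target / m for s, m in stratum_median.items() if m > 0}


def sample_reads(read_gc, weights, rng):
    """Randomly keep reads with probability proportional to their GC weight.

    Reads from a stratum with no weight are given weight 1.0 (an assumption).
    Returns the indices of the retained reads.
    """
    max_w = max(weights.values())
    kept = []
    for i, gc in enumerate(read_gc):
        w = weights.get(round(gc * 100), 1.0)
        if rng.random() < w / max_w:
            kept.append(i)
    return kept


reference = [(0.40, 10.0), (0.40, 12.0), (0.50, 20.0),
             (0.50, 22.0), (0.60, 5.0), (0.60, 7.0)]
weights = stratum_weights(reference)
kept = sample_reads([0.60] * 5, weights, random.Random(0))
```

Reads from the stratum with the largest weight are always retained, and over-represented strata are thinned in proportion to their excess coverage.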

Although GC correction at the amplicon level does reduce the amplification 
noise at the sub-megabase scale, significant variation at longer scales (>1 Mb) is 
still evident after GC correction. To additionally correct for long-range systematic 
amplification biases, especially in the small chromosomes (most evident in the 
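19p arm, Fig. 3b and Supplementary Table 2), we normalize the average
copy-number $CN_a^i$ for arm $a$ in sample $i$ as

$$x_a^i = \left(\log_2 CN_a^i - \overline{\log_2 CN_a}\right) \times \frac{\operatorname{median}_{b}\left[\operatorname{std}\left(\log_2 CN_b\right)\right]}{\operatorname{std}\left(\log_2 CN_a\right)}$$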

The first term on the right side of the equation is the standard log2 copy number;
$\overline{\log_2 CN_a}$ is the average log2 arm-level copy number across all samples (but excluding
samples with significant gains or losses, defined as a 10% difference from the
median). Subtracting the average eliminates systematic (recurrent) amplification bias
at the arm level (most noticeably on Chr. 19). The second term corrects for
variation in amplification bias across different chromosomes: the denominator,
std(·), is the standard deviation that reflects variability in amplification for a given
chromosome arm occurring between the different, independently generated,
samples; the numerator is the median of the standard deviation across all
chromosomes. The standard deviation is small for large chromosomes but becomes
bigger for small chromosomes. The normalized log2-scale copy number was
plotted in Fig. 2.
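
The arm-level normalization can be written out as a short sketch. It assumes that the per-arm mean and standard deviation are both computed after excluding samples whose copy number differs by more than 10% from the arm median, and that the sample standard deviation is used; the text specifies the exclusion only for the mean, so these are illustrative choices.

```python
import math
from statistics import mean, median, stdev


def normalize_arm_copy_number(cn):
    """cn maps each arm to a list of copy numbers, one per sample.

    Returns, per arm, (log2 CN - arm mean) * median(arm s.d.) / arm s.d.
    Assumption: samples deviating >10% from the arm median are dropped
    before computing both the mean and the standard deviation.
    """
    arm_mean, arm_sd = {}, {}
    for arm, values in cn.items():
        med = median(values)
        kept = [math.log2(v) for v in values if abs(v - med) <= 0.1 * med]
        arm_mean[arm] = mean(kept)
        arm_sd[arm] = stdev(kept)
    median_sd = median(arm_sd.values())
    return {
        arm: [(math.log2(v) - arm_mean[arm]) * median_sd / arm_sd[arm]
              for v in values]
        for arm, values in cn.items()
    }


# Hypothetical arm-level copy numbers for four samples.
example = {
    "1p": [2.0, 2.1, 1.9, 2.0],
    "19p": [2.15, 2.0, 1.85, 2.0],   # noisier small arm
    "3q": [2.0, 2.05, 1.95, 3.0],    # last sample carries a gain
}
normalized = normalize_arm_copy_number(example)
```

After normalization the noisy and quiet arms have the same spread across samples, while the genuine gain on the hypothetical 3q arm remains a clear outlier.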

The genome-wide coverage of each single-cell library is presented in 5 Mb bins 
in the CIRCOS plots in Fig. 3b and Fig. 4a. The standard deviation of the normal- 
ized bin-level coverage was estimated to be ~0.1 from the disomic Chr. 6 for all 
daughters, except for sample MN7 (s.d. ~0.3). The thresholds for significant
gains (mean coverage ratio > 1.35) and losses (mean coverage < 0.6) were chosen
because they are ~3.5 standard deviations from the mean value (0.95) for
MN1–MN6, MN8 and MN9; thresholds of coverage > 1.4 and coverage < 0.5 were
used for MN7, corrected for the additional variance. Where there were arm-level
copy number alterations present in single-cell samples, but absent from the bulk sequencing,
haplotype copy-number analysis (below) was used to discriminate systematic 
amplification bias (affecting both haplotypes) from true copy number gains or 
losses (Extended Data Fig. 3). 

Detection of loss-of-heterozygosity. The model that the chromosome in a rup- 
tured micronucleus undergoes de novo damage (that is, alterations that are not 
shared between daughters) has the clear prediction that the damage should occur 
on one of the homologous chromosomes but not the other. Our sequence-read- 
depth based copy number analysis indicated that some mis-segregated chromo- 
somes are distributed between the daughters in ~2:1 ratio (Fig. 2a, left panel). 
This implies that the cell with two copies of the affected chromosome contains the 
potentially damaged chromosome from the micronucleus and that the other cell 
should only contain the intact homologue of the mis-segregated chromosome 
(that is, monosomic). A further prediction, which would independently validate 
these copy number results, is that the daughter with two copies of the chro- 
mosome should be heterozygous at polymorphic sites whereas the other daughter 
should be hemizygous. To test this prediction, we developed a method to deter- 
mine the presence or absence of loss-of-heterozygosity (LOH) from the genotypes 
observed at heterozygous sites. 

Because the genotype data are derived from single-cell libraries that are subject 
to variable amplification, it is non-trivial to distinguish true LOH from incom- 
plete coverage due to uneven amplification by the MDA procedure. To address 
this problem, we derived an expected relationship between the observed number 
of sites showing heterozygosity and the observed number of sites showing the 
reference base (or equivalently, the number of sites showing the alternate base), 
given the sequence coverage. This relationship was derived with the assumption 
that the cell has one copy of each homologous chromosome (that is, ‘1:1 hetero- 
zygosity’). Knowing the expected heterozygosity at the desired level of coverage, 
we can then infer LOH, if the sequencing data deviate significantly from what is 
predicted for 1:1 heterozygosity. Because reference or alternate bases are counted 
as ‘present’ or ‘absent’, and the number of reads when present is ignored, this 
relationship is relatively insensitive to variable amplification and holds as long as 
the two homologues are amplified independently.

For a cell with a single copy of each homologous chromosome, assume that 
there are a total of M heterozygous sites and the fractional coverage of each 
homologous chromosome above a certain sequencing depth is p. (For example, 
when we required ≥ 3 reads to call a structural variant, p would correspond to the
fraction of each homologue covered with ≥ 3 reads, which defined the detection
sensitivity discussed later in the section ‘Estimation of SV detection sensitivity’.) If
the two homologues have identical copy number and their amplification bias is
independent, then the expected percentage of sites where we should observe both
the reference and the alternate bases (that is, both homologues are covered at or
above the specified depth) is given by p². If the observed percentage of sites
showing heterozygous coverage deviates from this prediction, then it implies that 
the two homologues have different copy numbers. In particular, it is straightfor- 
ward to see that when there is complete LOH for a given chromosome, the 
expected percentage of sites where we should observe both the reference and 
the alternate bases should be zero, if there are no genotyping, sequencing or 
systematic amplification errors. However, to determine if there are partial dele- 
tions in either homologue, we need to estimate the value of p. 

Estimating the fraction of a homologue that should be covered at any given 
depth by the sequencing data, p, is straightforward if the haplotype phases (that is, 
the order of reference and alternate bases at heterozygous sites in a single hap- 
lotype) of both homologues are known. However, this can be achieved even 
without knowledge of the haplotype phase as long as each homologue has a 
complete copy and their average coverage is equivalent, which is true for MDA 
when the coverage is evaluated over regions that are significantly larger than the 
amplicon size. We can account for p as follows. For a total of M heterozygous
sites, assume there are fM reference and (1 − f)M alternate bases in haplotype 1.
(It will become obvious below that the haplotype composition f will not affect the
estimate for p as long as the amplification of different homologues is
independent.) For this haplotype we expect to observe the reference base at p × fM
heterozygous sites that are covered by more than the threshold number of
sequence reads. For the other haplotype, there are (1 − f)M reference and fM
alternate bases, which by the same reasoning predicts that we should observe the
reference base at p × (1 − f)M heterozygous sites above the threshold of coverage.
Thus, without separating the two haplotypes, we expect to observe pfM reference
bases from the coverage of haplotype 1 and p(1 − f)M from the coverage of
haplotype 2, which gives a total number of pM reference bases at all heterozygous
sites. The same number is expected for the total number of alternate bases being
covered by more than the threshold number of sequence reads. Note that this
estimation for p comes directly from the coverage of the reference or alternate
bases at the heterozygous sites but does not rely on knowledge of the haplotype
phase or its composition f.

In summary, for a chromosome that consists of two single-copy homologues 
(‘1:1 heterozygosity’) and having a total number of M heterozygous sites where
the two homologues differ, we expect the following coverage at heterozygous sites:

The number of sites showing reference base coverage: pM

The number of sites showing alternate base coverage: pM

The number of sites showing heterozygous coverage: p²M

With these expected values we can derive an expected relationship between the 
fraction of heterozygous coverage relative to the average fractional coverage for 
the reference and for the alternate bases: 


$$\frac{[\text{fraction of sites with heterozygous coverage}]}{\left([\text{fraction of sites with reference coverage}]/2 + [\text{fraction of sites with alternate coverage}]/2\right)^{2}} = \frac{p^{2}}{p^{2}} = 1$$


which holds for the ‘1:1 heterozygosity’ scenario. Deviation from this ratio in the 
observation reflects changes in the copy number of either or both homologues. 
We therefore define the ratio on the left hand side of the above equation evaluated 
from the experimental data as a ‘heterozygosity coefficient’ to gauge the deviation 
of the homologue copy number from 1:1 heterozygosity. 
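
As a minimal sketch, the heterozygosity coefficient can be computed from three site counts. The function and argument names are hypothetical; `n_ref` and `n_alt` count every heterozygous site at which the reference or alternate base is observed at or above the depth threshold, including sites where both are observed.

```python
def heterozygosity_coefficient(n_ref, n_alt, n_het, n_sites):
    """Fraction of sites with heterozygous coverage divided by the squared
    mean of the reference and alternate coverage fractions.

    Close to 1 for two single-copy homologues; near 0 for complete LOH.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    frac_ref = n_ref / n_sites
    frac_alt = n_alt / n_sites
    frac_het = n_het / n_sites
    mean_allele_coverage = frac_ref / 2 + frac_alt / 2
    if mean_allele_coverage == 0:
        raise ValueError("no allele coverage at any site")
    return frac_het / mean_allele_coverage ** 2


# Hypothetical disomic chromosome: M = 1000 sites, p = 0.6, so p^2 M = 360.
print(heterozygosity_coefficient(600, 600, 360, 1000))
# Hypothetical hemizygous chromosome: no site shows both bases.
print(heterozygosity_coefficient(300, 300, 0, 1000))
```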

Importantly, we do not require direct knowledge of the absolute fractional 
coverage (p) to test whether the ratio on the left side of the equation gives the 
expected value close to 1. (We will however use this relationship below (section on 
detection sensitivity) to estimate p as a measure of the detection sensitivity.) It is 
based only on the relative value for the fraction of heterozygosity divided by the 
fraction of sites with reference (or of alternate) coverage. This relationship is 
therefore more robust than a direct test of the observed coverage on a given 
chromosome against p estimated from a different disomic chromosome, because 
the coverage p may vary slightly between different chromosomes due to sequence- 
specific amplification bias. 

Results shown in Extended Data Fig. 2 confirmed that for heterozygous chro- 
mosomes the heterozygosity coefficient is overall slightly above but close to unity, 
reflecting only a small degree of correlation for the amplification of the
homologues. For chromosomes that are completely or partially hemizygous (that is,
one homologue has complete or partial loss) the heterozygosity coefficients are 
significantly smaller. Of note, the heterozygosity coefficient is a sensitive method 
for detecting even a relatively small segment of heterozygosity in a chromosome 
that is otherwise near completely hemizygous. This is illustrated by the analysis of 
the MN2 minus daughter, where we could detect a 15 Mb segment of hetero- 
zygosity from Chr. 2q (~7% of Chr. 2) that is lost from the plus daughter 
(Extended Data Fig. 8a). 

The method is not, however, sensitive for detecting gains above 1:1 hetero- 
zygosity, as occurs when cells undergo 3:2 segregation patterns for the mis-seg- 
regated chromosome. For this circumstance, we directly calculated the haplotype 
copy number based on the haplotype phase information (see below). Note that 
the determination of the haplotype phase required the LOH analysis: The hap- 
lotype phase was obtained from the sequencing data of chromosomes that are 
inferred to be completely hemizygous by their heterozygosity coefficients, an 
inference that is independent from and more robust than the read depth-based 
arm-level copy number data shown in Fig. 2b and Supplementary Table 2. 
Haplotype copy number analysis. If the haplotype phase is known, then the copy 
number ratio of the two homologues, based on their haplotypes, can be directly 
determined in a given cell. Because the LOH analysis provided confidence of 
genuine hemizygosity (Extended Data Fig. 2), we were able to use the sequencing 
data to directly extract the haplotype phase for one homologue and then infer the 
phase of the other, and deconvolute the sequence coverage for each homologue 
(Extended Data Fig. 3a). 

This analysis was first applied to Chr. X where we observed recurrent gains in 
the sequence coverage of Xq that is nonetheless absent in the bulk sequencing data 
(Supplementary Table 2). As can be seen from the example of the haplotype copy 
number for Chr. X in the MN3 daughters (Extended Data Fig. 3b, top panel), the 
haplotypes were present in equal copy number in both daughters, enabling us to 
attribute the read depth variation (between the Xp and the Xq arms) to systematic 
amplification bias. Variable penetrance of this amplification bias presumably 
caused the apparent copy-number asymmetry in the C1 and MN7 daughters
(Fig. 2b), which were the samples with the most read depth noise. Indeed, hap- 
lotype copy-number analysis confirmed that there was no difference in the rela- 
tive copy number of the two Chr. X homologues in both C1 daughters (Extended 
Data Fig. 3b, middle panel). It also demonstrated a true gain of one Chr. X 
haplotype was shared in both MN7 daughters (Extended Data Fig. 3b, bottom 
panel). 

By contrast with the amplification bias, haplotype copy-number analysis 
verified that there was a gain of a single homologue of the mis-segregated 
chromosome in the plus daughters from the 3:2 segregations (MN7 and MN8,
Extended Data Fig. 3c), excluding the possibility that the gain is due to mixed 
segments from both homologues or uniparental trisomy. Moreover, this analysis 
confirmed that the minus daughters from these cells had a 1:1 ratio of the two 
homologues. Therefore, the haplotype copy number established the segregation 
model depicted in Fig. 2a (right panel) for the 3:2 segregation samples. 

Knowledge of the haplotype phase was also critical for obtaining the fine 
structure of copy number alterations. For the daughters with a 2:1 distribution 
of the mis-segregated chromosome, knowing the haplotype phase enabled a 
digital readout for whether a segment of that haplotype was deleted or retained 
(Fig. 4b, Extended Data Fig. 8b). Like the LOH analysis, this readout is minimally 
affected by amplification noise: heterozygosity is scored as present or absent and 
differences in the numbers of reads at different heterozygous sites are ignored. 
(To capture copy number gains, for example, in the 3:2 segregations, the read 
depth signal is needed and amplification bias will have some effect on the detec- 
tion of focal gains. We are currently working on methods to address this 
problem.) 

We used the haplotype phase information for Chr. 3 (inferred from the MN3
minus daughter) to generate segmented copy number profiles for the
mis-segregated haplotype in both MN4 daughters that contained interspersed deletions
(Fig. 4b). Each phaseable SNP served as a probe for the altered haplotype; the 
boundaries of deletions were identified by a transition from ‘covered’ (that is, 
present) probes to ‘uncovered’ probes or vice versa. For each SNP, we provision- 
ally designated it as being ‘covered’ if there were reads supporting the haplotype, 
and ‘uncovered’ if there were zero supporting reads. This raw signal at each SNP 
was then smoothed using the coverage of other SNPs within the local bin to 
eliminate false negative assignments of SNPs not being covered because of amp- 
lification bias. We set this bin size to 100 kb because it is much larger than the 
typical MDA amplicon and therefore unlikely to have been completely lost 
because of variation in amplification. In the rare regions with sparsely
distributed SNPs, we extended the bin to contain at least 8 flanking phaseable SNPs,
which was chosen because with 8 phaseable SNPs, the probability that half of 
them (4 out of 8) were absent due to variable coverage (false LOH) was less than 
0.1% based on the current level of variant detection sensitivity and heterozygous 
coverage (Supplementary Table 1). (The probability of having sequencing or 
amplification errors generating false heterozygosity in 4 out of 8 sites is even 
smaller.) For each SNP, if more than 50% of the nearby SNPs were covered, the 
SNP of interest was designated as covered; if that value was less than 50%, that 
SNP was designated as being not covered; if the local coverage was 50%, the SNP 
was designated as being at the boundary of a deletion. This process was iterated 
until convergence; the final designation for each SNP can only be ‘covered (1)’, 
‘uncovered (0)’, or ‘boundary (0.5)’. We then connected all SNPs with the same 
copy number states and identified copy-number changes directly. This procedure 
was used to generate the segmented copy-number profiles for MN4 daughters in 
Fig. 4b, with the average coverage for each bin shown as grey dots. 
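
The iterative SNP smoothing and segmentation can be sketched as below. Several details are assumptions made for illustration: the 100 kb bin is centred on each SNP, the SNP itself is included in its own window, boundary (0.5) states count as half-covered when the window is tallied, and iteration stops after a fixed cap if the states do not settle.

```python
def local_window(positions, i, bin_size, min_flanking):
    """Index range of SNPs within bin_size/2 of SNP i, widened until it
    holds at least min_flanking other SNPs (positions must be sorted)."""
    n = len(positions)
    lo = hi = i
    half = bin_size / 2
    while lo > 0 and positions[i] - positions[lo - 1] <= half:
        lo -= 1
    while hi < n - 1 and positions[hi + 1] - positions[i] <= half:
        hi += 1
    while hi - lo < min_flanking and (lo > 0 or hi < n - 1):
        if lo > 0:
            lo -= 1
        if hi < n - 1 and hi - lo < min_flanking:
            hi += 1
    return lo, hi


def smooth_coverage(positions, covered, bin_size=100_000,
                    min_flanking=8, max_iter=100):
    """Assign each SNP 1.0 (covered), 0.0 (uncovered) or 0.5 (boundary)
    by majority vote over its local window, iterating until stable."""
    state = [1.0 if c else 0.0 for c in covered]
    for _ in range(max_iter):
        updated = []
        for i in range(len(state)):
            lo, hi = local_window(positions, i, bin_size, min_flanking)
            window = state[lo:hi + 1]
            twice_total = 2 * sum(window)
            if twice_total > len(window):
                updated.append(1.0)
            elif twice_total < len(window):
                updated.append(0.0)
            else:
                updated.append(0.5)
        if updated == state:
            break
        state = updated
    return state


def segments(positions, state):
    """Join consecutive SNPs that share a state into (start, end, state)."""
    out, start = [], 0
    for i in range(1, len(state) + 1):
        if i == len(state) or state[i] != state[start]:
            out.append((positions[start], positions[i - 1], state[start]))
            start = i
    return out


# Hypothetical haplotype: 30 SNPs every 10 kb, retained then deleted,
# with one amplification dropout (index 5) and one spurious read (index 22).
pos = [i * 10_000 for i in range(30)]
raw = [i < 15 for i in range(30)]
raw[5], raw[22] = False, True
print(segments(pos, smooth_coverage(pos, raw)))
```

In this toy input the isolated dropout and the isolated spurious SNP are both absorbed by their neighbourhoods, leaving a single retained segment followed by a single deleted segment.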
Detection of chromosomal rearrangements. Chromosomal rearrangements 
were detected from clusters of discordant read pairs. Read pairs were designated 
to be discordant if both mates mapped to the reference genome and the inferred 
insert size exceeded 20 kb, irrespective of the orientation of the mapped pair 
mates. The 20 kb threshold is significantly longer than the average length of 
the sequencing fragment (~500 bp) but is comparable to the typical size of the 
amplicon of multi-strand displacement amplification. The choice of a 20 kb 
threshold thus excludes a large number of short-range artificial chimaeras 
(mostly of inverted orientation) that result from amplicon annealing or phi-29 
polymerase template switches. As explained below, intrachromosomal
rearrangements within the 20-100 kb range were also populated by artefacts and we
used the frequency of these events on control chromosomes to estimate the 
background frequency of MDA-generated chimaeras. The control chromosomes 
included all the chromosomes from the daughters derived from non-micronu- 
cleated mothers (C1-C4 and N1-N6) as well as all chromosomes from both 
daughters from the micronucleated mothers, with the exception of the mis-seg- 
regated chromosomes. 

We initially applied a low threshold of two discordant read pairs to search for 
candidate rearrangements. Discordant pairs include both primary discordant 
pairs, where the two pair mates are aligned to discordant loci, and split reads, 
where a single read is split into two parts and aligned to non-contiguous seg- 
ments. A primary discordant pair can contain one mate that itself constitutes a 
split read; in this scenario, the discordant pair and the split read were counted as a 
single discordant fragment. Although split read alignment potentially provides 
base-level breakpoint resolution, the smaller size of each split alignment means 
that they are more likely to be misaligned to the reference genome due to short 
interspersed repeats. We therefore required each discordant cluster to consist of 
at least one primary discordant pair with each mate aligned to discordant loci with 
a mapping quality of 30 or greater (from bwa mem, ~0.1% alignment
uncertainty). From this first-pass analysis there were a total of 8,179 clusters of
discordant pairs in all samples.

Next, we excluded all discordant clusters that consisted of reads from daughters
with different mothers, or reads from the bulk RPE-1 line. Such clusters could
reflect recurrent artefacts due to inaccuracy in the alignment or in the reference 
assembly. Although expected to be infrequent, we also cannot exclude the 
possibility that some of these clusters reflect clonal or subclonal chromosomal 
rearrangements that accumulated during cell culture. The remaining 2,665 non- 
recurrent clusters are expected to include both putative de novo chromosomal 
rearrangements and random artefacts due to MDA. There were 2,403 intrachro- 
mosomal and 262 interchromosomal clusters that consisted of two or more 
discordant read pairs. There were 1,088 intrachromosomal and 76 interchromo- 
somal clusters with three or more discordant read pairs. Discordant clusters 
supported by only two discordant read pairs occurred at a higher frequency than 
predicted by the haplotype coverage and were especially enriched in short-range 
events (data not shown). Because of the expectation that amplification artefacts 
(excepting those that occur very early in the reaction) would be supported by 
fewer discordant read pairs, we expected more frequent artefacts among the 
structural variants (SV) supported by only two discordant read pairs and required 
putative rearrangements be supported by at least three discordant pairs and also 
by at least one split read. We relaxed these criteria only in a few exceptions where 
there was additional supporting evidence, such as for the chained rearrangements 
depicted in Extended Data Fig. 7b, but subsequently performed PCR validation of 
all the detected events. 
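
The cluster-level criteria described in this section can be collected into a small filter. This is a sketch under stated assumptions: the `Fragment` record and its field names are hypothetical, interchromosomal pairs are treated as discordant regardless of distance, and a fragment that is both a primary discordant pair and a split read is represented once, as in the text.

```python
from dataclasses import dataclass

MIN_INSERT_BP = 20_000   # insert-size threshold for discordant pairs
MIN_MAPQ = 30            # mapping quality required for the anchoring pair


def is_discordant(chrom1, pos1, chrom2, pos2):
    """Mates on different chromosomes, or more than 20 kb apart,
    irrespective of orientation."""
    if chrom1 != chrom2:
        return True
    return abs(pos2 - pos1) > MIN_INSERT_BP


@dataclass
class Fragment:
    is_primary_pair: bool   # the two mates align to discordant loci
    has_split_read: bool    # one read is split across the breakpoint
    mapq1: int = 0
    mapq2: int = 0


def passes_filters(cluster, min_fragments=3):
    """Require >= 3 discordant fragments, at least one split read, and at
    least one primary pair whose mates both have MAPQ >= 30."""
    anchored = any(f.is_primary_pair and min(f.mapq1, f.mapq2) >= MIN_MAPQ
                   for f in cluster)
    has_split = any(f.has_split_read for f in cluster)
    return len(cluster) >= min_fragments and anchored and has_split


cluster = [Fragment(True, False, 60, 45),
           Fragment(True, True, 20, 60),
           Fragment(False, True)]
print(passes_filters(cluster))
```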
Enrichment analysis of chromosomal rearrangements in the mis-segregated 
chromosome. Random MDA-derived artificial chimaeras and true de novo 
chromosomal rearrangements cannot be distinguished by the variant allele fre- 
quency or by the number of discordant pairs. We therefore developed statistical 
criteria to determine if clusters of rearrangements on the mis-segregated chro- 
mosome in the plus daughter cells were significantly enriched relative to the 
background observed for other chromosomes. 

Template switch events in multi-strand displacement amplification can result 
in frequent short-range chimaeras of the inverted orientation; by contrast, long-
range artificial chimaeras or de novo chromosomal translocations are not
expected to have a preferred orientation. The observed SV events in the single- 
cell DNA libraries indeed confirmed this prediction (Extended Data Fig. 5a). 
A substantial enrichment of short-range SVs (20-50 kb) of the inverted type 
relative to the non-inverted type was observed in all control chromosomes: both 
daughters from the non-micronucleated mothers (C1-C4, N1-N6) and all chro- 
mosomes, excluding the mis-segregated chromosomes, in the daughters from the 
micronucleated mothers. We therefore excluded short-range SVs from the list of 
candidate de novo rearrangements, even though this likely underestimates the 
numbers of true de novo events. 

To rigorously establish a threshold to distinguish short-range from long-range 
SVs, we looked at the cumulative distribution of inverted and non-inverted type 
SVs relative to the breakpoint distance (Extended Data Fig. 5b). By simple visual 
inspection it is clear that at short breakpoint distances, inverted-type SVs are 
more frequent than non-inverted type SVs. However, the frequencies of inverted 
and non-inverted type SVs become equivalent at longer breakpoint distances. 
Power law fitting revealed that events of the intermediate length range (85 non- 
inverted type SVs in the range of 200 kb ~ 5 Mb) followed a power law scaling, 
P(no. SVs = 1) = 1°, By contrast, short-range inverted-type SVs (632 inverted 
type SVs <100 kb) decayed as ~ I~", reflecting a much higher frequency of 
inverted-type SVs at shorter breakpoint separations. The intersection of these two 
decay curves indicated that at or above 150 kb breakpoint distances, inverted and 
non-inverted type SVs approximately followed the same distribution. This ana- 
lysis led us to adopt 150 kb as an operational cut-off for ‘long-range’ rearrange- 
ments that excludes most systematic MDA artefacts. We note that this cut-off 
does not attenuate signal from the mis-segregated chromosomes, because these 
chromosomes had a highly significant enrichment of rearrangements separated 
by >500 kb breakpoint distances (Extended Data Fig. 5c).

We then estimated the frequency of short- and long-range (including inter- 
chromosomal) rearrangements on all control chromosomes to establish the back- 
ground rates of these events (Extended Data Fig. 4a). The frequency of 
breakpoints within each chromosome arm was calculated as the ratio of the total 
number of observed breakpoints divided by the length of the relevant arm, cor- 
rected for the copy number of that arm and for the detection sensitivity for each 
sample (that is, the percentage of each homologue that is covered at or above the 
threshold depth for variant detection). The arm-level copy number was deter- 
mined as described above and shown in Supplementary Table 2. The detection 
sensitivity estimated from sequence coverage (to be discussed further below) is 
shown in Supplementary Table 1. Short-range rearrangements occurred at an
average frequency of 1.5 events per 100 Mb; long-range rearrangements occurred
at an average frequency of 0.5 events per 100 Mb. By a χ² test, the distribution of
either short-range or long-range events on the control chromosomes was indis- 
tinguishable from that expected for a uniform frequency with normally distrib- 
uted error (minimum P = 0.18 for short non-inverted type rearrangements, 3 
degrees of freedom). 
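
One way to read the breakpoint-frequency correction is as a division by the effective amount of detectable DNA in the arm. The sketch below assumes exactly that (length × copy number × detection sensitivity); the function name and the choice to report events per 100 Mb are illustrative.

```python
def breakpoint_frequency(n_breakpoints, arm_length_bp, copy_number, sensitivity):
    """Breakpoints per 100 Mb of detectable sequence.

    Assumption: the corrections for copy number and detection sensitivity
    are applied multiplicatively to the arm length.
    """
    effective_length = arm_length_bp * copy_number * sensitivity
    if effective_length <= 0:
        raise ValueError("effective length must be positive")
    return n_breakpoints / effective_length * 1e8


# Hypothetical arm: 100 Mb, two copies, half of each homologue detectable.
print(breakpoint_frequency(3, 100_000_000, 2, 0.5))
```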

To determine if the observed structural variants follow a uniform distribution, 
we performed one-sided Poisson tests of the number of SVs observed in the test 
chromosome against the background frequency estimated from all chromosomes 
including the test chromosome on a per-sample basis. In 8 out of 9 daughter cell 
pairs from micronucleated mothers, there was a significant enrichment of SVs 
(combining short- and long-range events) on the mis-segregated chromosome 
(Extended Data Fig. 4a). By contrast, there was no enrichment for short-range 
rearrangements (Bonferroni corrected P > 0.3) on the mis-segregated chromo- 
somes (Extended Data Fig. 4b, middle panel). Moreover, the enrichment of long- 
range SVs on the mis-segregated chromosome was even more marked, the most 
extreme case being the MN3 pair, with an estimated P value <10^−100. Meanwhile,
the only exception, MN5, served as a negative control for the statistical test. 
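
A one-sided Poisson test of this kind can be sketched with the standard library alone. The upper tail is summed directly, rather than computed as one minus the lower tail, so that very small P values are not lost to rounding. The function names and the simple Bonferroni helper are illustrative, not the authors' implementation.

```python
import math


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summing the upper tail directly."""
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    # Probability of exactly k events, computed in log space.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total, i = 0.0, k
    while True:
        total += term
        i += 1
        term *= lam / i
        if term <= total * 1e-16:   # remaining terms are negligible
            break
    return min(total, 1.0)


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical example: 12 SVs observed where the background predicts 1.5.
p = poisson_upper_tail(12, 1.5)
print(p, bonferroni(p, 9))
```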

We also performed a Fisher’s exact test of the observed number of long-range 
and short-range events on the mis-segregated as compared with the remaining 
chromosomes, assuming the short-range events represent an empirical sampling 
of background events (summarized in Extended Data Table 1). The Fisher’s exact 
test also confirmed the enrichment of long-range events in the mis-segregated 
chromosome relative to the rest of the genome with the background given by 
short-range events. MN5 was the only example where the partitioning of a chro- 
mosome (Chr. 7) into a micronucleus appeared not to have led to detectable 
rearrangements. 
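
As an illustration of the Fisher's exact test above, the sketch below evaluates a 2 × 2 table of long-range and short-range events on the mis-segregated chromosome versus the rest of the genome. The one-sided alternative, the function name, and the example counts are assumptions made here; the text does not state which alternative was used.

```python
from math import comb

def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    a: long-range events on the mis-segregated chromosome
    b: long-range events on the remaining chromosomes
    c: short-range events on the mis-segregated chromosome
    d: short-range events on the remaining chromosomes
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b          # all long-range events
    col1 = a + c          # all events on the mis-segregated chromosome
    n = a + b + c + d
    upper = min(row1, col1)
    numerator = sum(comb(row1, x) * comb(n - row1, col1 - x)
                    for x in range(a, upper + 1))
    return numerator / comb(n, col1)

# Hypothetical counts, for illustration only.
p_value = fisher_exact_greater(38, 5, 4, 50)
```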

Finally, we also tested the enrichment of long-range rearrangement breakpoints
in each of the normally segregated chromosomes in every sample. The
background frequency was estimated by the average among all control chromo- 
somes in each sample, that is, for all chromosomes in each control daughter pair, 
and for all but the mis-segregated chromosome(s) in each micronucleated daugh- 
ter pair. No normally segregated chromosome or chromosomal arm reached the 
significance level of P = 0.05 after Bonferroni correction (results for the statistical 
tests are not shown but raw data are in the Supplementary Tables). 
Estimation of SV detection sensitivity. Whole-genome amplification bias 
results in varied coverage across each chromosome. Even at the same locus, this 
coverage can be different between the homologous chromosomes. With a 
requirement of ≥3 reads to support a structural variant, we will only detect
SV events when the sequence coverage at the sites of rearrangements exceeds 
this threshold. Importantly, this threshold is required for reads from a single 
homologue. (For example, even at loci that are covered by more than three reads, 
it is possible that by random chance all the reads were derived from amplification 
of the intact homologue, and the variant homologue is missing from the sequen- 
cing data.) Therefore, the SV detection sensitivity is equivalent to the expected 
fraction, p, of each chromosome homologue that is covered, in this case by three 
or more reads, which can be estimated from coverage at heterozygous sites as 
discussed above in the section on LOH detection. 

For chromosomes that have a single copy of each homologue, the fraction of 
one homologue that is covered by three or more reads is equal to the fraction of 
heterozygous sites at which we observe the reference base (or equivalently, the 
alternate base) by three or more reads. Because there can be subtle variation in the 
amplification of different chromosomes, we generated a per-cell reference for 
detection sensitivity, using the coverage at heterozygous sites in Chrs. 5 and 6 
consistently. These chromosomes were chosen because the LOH analysis 
(Extended Data Fig. 2b) and the read-depth analysis (Fig. 2b) indicated that both 
homologues were present at a 1:1 ratio in every cell that we analysed (1:1 hetero- 
zygosity). 

Three metrics of haplotype coverage are reported in Supplementary Table 1: 
the fraction of genome coverage (% of heterozygous sites covered with ≥1 read),
the fraction of heterozygous coverage (% of heterozygous sites covered with ≥1
read corresponding to each genotype), and the detection sensitivity for variants
[(% of sites covered by ≥3 reads of the reference base + % of sites covered by ≥3
reads of the alternate base)/2].
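
The three metrics can be computed from per-site allele read counts as in the following sketch. The input layout (one pair of reference and alternate read counts per heterozygous site) and the function name are assumptions introduced for illustration.

```python
def haplotype_coverage_metrics(sites, min_reads=3):
    """Compute the three coverage metrics from heterozygous-site read counts.

    sites: list of (ref_reads, alt_reads) tuples, one per heterozygous site.
    min_reads: read threshold required to support a variant (3 in the text).
    """
    n = len(sites)
    covered = sum(1 for ref, alt in sites if ref + alt >= 1)
    both = sum(1 for ref, alt in sites if ref >= 1 and alt >= 1)
    ref_ok = sum(1 for ref, alt in sites if ref >= min_reads)
    alt_ok = sum(1 for ref, alt in sites if alt >= min_reads)
    return {
        "genome_coverage": covered / n,
        "heterozygous_coverage": both / n,
        # Mean of the reference-base and alternate-base fractions.
        "detection_sensitivity": (ref_ok + alt_ok) / (2 * n),
    }

# Toy input: four heterozygous sites.
metrics = haplotype_coverage_metrics([(3, 0), (1, 1), (0, 0), (4, 3)])
```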

Based on the latter metric, we estimate that at the current sequencing depth 
(~5×), we should detect 30-40% of all de novo structural rearrangements that
occur on a single homologue. For two samples, N3 and MN4, we doubled the
sequencing depth to ~9× per cell and the estimated detection sensitivity
correspondingly reached 60-70%. For example, in the N3 pair, the total number of
rearrangements detected from the entire data set was 17, whereas at half the 
depth, 10 events were detected. Similarly, in the MN4 pair, the total number of 
rearrangements detected from the entire data set was 38 for the mis-segregated 
chromosome and 26 for the remainder of the genome; at half the depth, we
observed 21 events in the mis-segregated chromosome and 15 events in the rest of
the genome. These observations support the validity of our estimation of the 
detection sensitivity. Importantly, at higher sequencing depths, the statistical 
significance associated with the enrichment of rearrangement events in the 
mis-segregated chromosome only became more pronounced. Thus, although 
we are almost certainly underestimating the absolute numbers of rearrangements 
at the current level of sequencing coverage, we can nevertheless support with 
confidence the conclusion that there is enrichment of rearrangements on the mis- 
segregated chromosome. 
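
The consistency between the event counts and the estimated sensitivities can be checked with simple arithmetic, as sketched below. The sketch assumes that "half the depth" corresponds to the ~5× data set, so that the ratio of events detected at half depth to full depth should fall between 30%/70% and 40%/60%; the variable names are illustrative.

```python
# Event counts quoted in the text: (full depth, half depth).
counts = {
    "N3, whole genome": (17, 10),
    "MN4, mis-segregated chromosome": (38, 21),
    "MN4, rest of genome": (26, 15),
}

# Estimated detection sensitivity ranges quoted in the text.
low_depth = (0.30, 0.40)
high_depth = (0.60, 0.70)

# Range of half-depth/full-depth ratios implied by the sensitivities.
ratio_min = low_depth[0] / high_depth[1]
ratio_max = low_depth[1] / high_depth[0]

observed = {name: half / full for name, (full, half) in counts.items()}
consistent = {name: ratio_min <= r <= ratio_max for name, r in observed.items()}
```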

Knowing the detection sensitivity is also important for the statistical inference 
that the observed unique SVs in each daughter cell are genuine, de novo events 
rather than a random sampling of shared events that are incompletely detected. If 
we assume that there are a total of N SVs that are shared between the two cells and 
the detection sensitivity is p_A in the plus daughter (cell A) and p_B in the minus daughter (cell B),
then the number of SV events that are expected to be detected in both cells is
N p_A p_B. Moreover, the number of SVs that are unique to cell A is N p_A (1 − p_B), and
the number of SVs that are unique to cell B is N p_B (1 − p_A). Even if we do not know
the total number of events, N, the relative fraction of shared versus unique events 
can be derived from the detection sensitivity. We therefore performed a multino- 
mial test of the number of events observed in each of these categories (shared, 
unique to cell A, unique to cell B) against these ratios. Based on the results of the 
multinomial test, the hypothesis that the observations are due to incomplete 
detection can be rejected except in MN2 (Extended Data Fig. 4c, MN5 does 
not show enrichment of rearrangements in the mis-segregated chromosome). 
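
The expected category ratios and an exact multinomial test against them can be sketched as follows. The exact-test procedure shown (summing the probabilities of all outcomes no more likely than the observed one) is an assumption made here, as the text does not specify how the multinomial test was evaluated; the function names are illustrative.

```python
from math import factorial

def category_probs(p_a: float, p_b: float):
    """Expected fractions of (shared, unique to A, unique to B) detected events
    when all events are truly shared and detection is incomplete."""
    weights = [p_a * p_b, p_a * (1 - p_b), p_b * (1 - p_a)]
    total = sum(weights)  # the unknown N cancels in the normalization
    return [w / total for w in weights]

def multinomial_pmf(counts, probs) -> float:
    """Probability of one outcome under a multinomial distribution."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    p = float(coef)
    for c, q in zip(counts, probs):
        p *= q ** c
    return p

def exact_multinomial_test(observed, probs) -> float:
    """Sum the probabilities of all outcomes no more likely than the observed."""
    n = sum(observed)
    p_obs = multinomial_pmf(observed, probs)
    p_value = 0.0
    for shared in range(n + 1):
        for only_a in range(n - shared + 1):
            p = multinomial_pmf((shared, only_a, n - shared - only_a), probs)
            if p <= p_obs * (1 + 1e-9):
                p_value += p
    return p_value

# Hypothetical example: sensitivities of 0.35 in both cells, with
# 1 shared event, 20 unique to cell A and 2 unique to cell B.
p_value = exact_multinomial_test((1, 20, 2), category_probs(0.35, 0.35))
```

A small P value indicates that the observed excess of unique events is unlikely to arise from incomplete detection of shared events.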
Rearrangement validation by PCR. The junction sequence for each rearrange- 
ment was constructed from the reference sequence and the breakpoint coordi- 
nates of the partner loci. For each rearrangement two primer pairs were designed 
spanning the rearrangement junction: one pair for the rearranged sequence, the 
other for the reference sequence (Extended Data Fig. 6, primer data available 
upon request). When adjacent SNPs were present, PCR primers were designed to 
incorporate these sites to generate genotype information on the rearranged and 
the wild-type products. PCR was performed in both daughters: 10 ng of whole- 
genome amplified DNA was used per reaction and subject to 35 cycles of PCR. 
The product was gel purified and Sanger sequenced in both directions to validate 
the rearrangement and to infer the haplotype associated with each product. 
Association of rearrangements with the gained haplotype. If rearrangements 
result from alteration of a single chromatid, then they should all be associated with
the same haplotype. Furthermore, a strong prediction of the model that rearran- 
gements occur from damage in micronuclei is that all rearrangements detected in 
the mis-segregated chromosome should be associated with the gained haplotype. 

Haplotype phasing (associating rearrangements with a specific haplotype by 
the genotypes at polymorphic sites) was done in two ways. If a polymorphic site 
was close enough to a rearrangement junction (that is, within the size range of an 
average DNA sequencing fragment), we looked for sequencing reads that both 
covered the polymorphic site and supported the rearrangement (either the read 
pair was a discordant pair or one pair mate was a split read supporting the 
rearrangement). We also performed long-range PCR (~1 kb) on the MDA amp- 
lified DNA to generate a product that would include the rearrangement junction 
and incorporate one or more polymorphic sites that were further away and could
not be phased by sequencing reads directly. Sanger sequencing of this product was
then used to test for association of the rearrangement with either haplotype. 
Importantly, Sanger sequencing can demonstrate that the rearranged product 
was derived only from the gained haplotype. PCR of the reference sequence 
was also performed on DNA from both daughters to determine the haplotype 
associated with the wild-type product. 

The PCR validation strategy and representative examples are illustrated in 
Extended Data Fig. 6 and results of the PCR validation and haplotype association 
are summarized in Extended Data Table 1 and Supplementary Table 5. In each 
case where we PCR-validated a rearrangement with an informative nearby poly- 
morphism, the rearranged product showed that the rearrangement was linked to 
the gained haplotype at one or more polymorphic sites. In addition, in all cases,
the PCR validation of the wild-type products in the minus daughters
showed the genotype from the intact haplotype. For 2:1 segregations, this poly- 
morphic site was hemizygous, consistent with the segregation model. 

Interestingly, the reference sequence PCR product in the plus daughters was 
hemizygous in some events and heterozygous in others (Supplementary Table 5, 
Extended Data Fig. 6d). In each case where this product was hemizygous, the 
genotype corresponded, as expected, to the intact haplotype, that is, the normally 
segregated chromosome. In other cases where the reference sequence product was 
heterozygous, the presence of both rearranged and reference sequence products 
with the gained haplotype is most easily explained by a DNA break at one side of a 
DNA replication fork (Extended Data Fig. 6e). This result is consistent with the 
cytological evidence suggesting partial DNA replication (Fig. 1).
Extended Data Figure 1 | DNA damage and double-strand breaks occur in
micronuclei when replication is coincident with nuclear envelope rupture. 
a, Nuclear envelope rupture in G1 is not sufficient to induce DNA damage in 
micronuclei. Left, graph shows the percentage of ruptured micronuclei, 
determined by the loss of GFP-NLS, that were positive for γ-H2AX
(>3 standard deviations above the mean of S-phase primary nuclei), and the
fluorescence intensities for γ-H2AX labelling in the indicated samples.
N > 100 from two experiments for each time point (see Methods). Error bars,
standard error of the mean. Right, images of representative cells with 
micronuclei highlighted in boxes. b, Ruptured micronuclei have double-strand 
breaks detected by GFP-Gam. Left, graph shows the percentage of micronuclei 
with one or more GFP-Gam positive foci in γ-H2AX-positive and negative
micronuclei in S phase. N > 100 from two experiments for each category 
(see Methods). Right, images of a representative cell with a damaged 
micronucleus (highlighted in boxes) and an intact micronucleus. Inset, 
magnification of GFP-Gam signal. c, Nuclear envelope rupture of micronuclei 
in G0 phase cells does not result in significant DNA damage. Micronucleation
was induced in RPE-1 cells by a nocodazole block-and-release protocol, and
cells were released into serum-depleted (G0) or serum-replete medium (S) for
17 h. Left, graph shows the percentage of micronuclei that were positive for
γ-H2AX (>3 standard deviations above the average level in S-phase primary
nuclei) as well as the distribution of the fluorescence intensities for γ-H2AX
labelling in the indicated samples. Percentage of ruptured micronuclei was 
from a parallel sample with a GFP-NLS-expressing RPE-1 line. N > 100 
from two experiments for each time point (see Methods). Right, images of 
representative cells. Error bars show standard error of the mean. d, The 
majority of damaged micronuclei have initiated DNA replication. Replication 
was detected by continuous EdU labelling following nocodazole release and 
integrated EdU signal normalized over nuclear or micronuclear area. Left, 
percentage of γ-H2AX positive micronuclei that were positive (>3 standard
deviations above the background) or negative for EdU. N > 100 from two
experiments (see Methods). Error bars show standard error of the mean. 
Right, images of a representative cell. Inset, over-exposure to visualize low-level 
EdU labelling of the micronucleus.
Extended Data Figure 2 | Determination of loss-of-heterozygosity.

a, Cartoon comparing the expected single-cell sequencing coverage of 
heterozygous and hemizygous chromosomes at polymorphic sites. Loss-of- 
heterozygosity (LOH) can be inferred from the scarcity of heterozygous 
genotypes without knowing the haplotype phase (that is, the genotypes at 
polymorphic sites for each homologue). The presence or absence of the 
reference or the alternate base provides a digital read-out of heterozygosity or 
LOH that is insensitive to read-depth noise common in single-cell sequencing 
data. This can be quantified as a heterozygosity coefficient, the ratio of the 
observed fraction of heterozygosity relative to the expectation for a 
heterozygous chromosome consisting of a single copy of each homologue (‘1:1 
heterozygosity’). For a diploid cell with 1:1 heterozygosity, if the fraction of sites 
that are covered (≥1 read per site) for each homologue is denoted as p, then the
expected fraction of sites with heterozygous coverage is approximately p^2. If the
chromosomes are equally amplified then p ≈ 1/2 (observed reference base +
observed alternate base)/total sites (Methods). b, Heat map of the
heterozygosity coefficients for all chromosomes from all single cell samples 
included in Fig. 2b plus two additional single cells (‘singletons’) with 
monosomies that were sequenced to identify the haplotype phase of the 
monosomic chromosomes. Near complete LOH in the MN1, MN3, MN5, and
MN6 minus daughter cells independently confirmed the monosomy of the mis- 
segregated chromosome, as determined from DNA copy number analysis 
(Fig. 2b). Note that the MN2 and MN6 daughters had monosomies (Chr. 18 in 
MN2 and Chr. 9 in MN6) shared in both daughters, indicating that the 
monosomy was pre-existing in the mother cell. Two ‘singletons’ were identified 
as having monosomy in Chr. 4 and in Chr. X based on low-pass MiSeq
sequencing. They were each sequenced to ~4.5× to generate the haplotype
phase of Chr. 4 and Chr. X. Note that the second singleton also had 
monosomies in Chrs. 15 and 16; the haplotype phases for these two 
chromosomes were not used in the current study, so they were omitted from the 
table in c, but the data are available upon request. c, Table summarizing results 
from the loss-of-heterozygosity analysis. Heterozygosity coefficients are shown 
for the boxed chromosomes in b, red heterozygosity coefficients indicate 
complete LOH; orange are partial LOH. For the cell ID, we indicate the 
individual daughters as ‘MN#(a)’ or ‘MN#(b)’. The cases denoted as ‘MN#(a/b)’
are monosomies shared in both daughters. The third column lists the number 
of heterozygous sites detected for the indicated chromosome from the bulk 
sequencing data (Methods). Columns 4-6 summarize the number of sites at 
which sequencing coverage from each cell supports the presence of reference 
bases (‘Ref.’), alternate bases (‘Alt.’), or both (‘Het.’). The last column lists the 
heterozygosity coefficients calculated for the indicated chromosome in the 
specified cell or cells. The average heterozygosity coefficient is 1.08 for all 
chromosomes that did not show LOH from all samples (last row). For 
chromosomes with near complete LOH (rows 1-8) the small number of 
heterozygous sites is likely due to genotyping errors (for example, duplicated/ 
homeologous sequences on the same chromosome) or amplification and 
sequencing errors. The incomplete LOH in the MN2 and MN4 daughters 
results from the reciprocal distribution of a fragmented chromosome between 
the two daughters (Extended Data Fig. 8). Note that our calculated 
heterozygosity coefficient can sensitively detect even a small region of 
heterozygosity in the MN2 minus cell (Extended Data Fig. 8a). 
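
The heterozygosity coefficient defined in a can be computed from the site counts tabulated in c, as in the following sketch. The function name and the example counts are assumptions for illustration; the calculation takes the per-homologue coverage p as half the sum of the reference and alternate fractions and the expected heterozygous fraction as p squared.

```python
def heterozygosity_coefficient(n_ref: int, n_alt: int, n_het: int,
                               n_total: int) -> float:
    """Ratio of observed to expected heterozygous coverage.

    n_ref: sites at which the reference base is observed
    n_alt: sites at which the alternate base is observed
    n_het: sites at which both bases are observed
    n_total: heterozygous sites known from bulk sequencing
    """
    p = 0.5 * (n_ref + n_alt) / n_total   # per-homologue coverage
    expected_het = p * p                  # expectation for 1:1 heterozygosity
    observed_het = n_het / n_total
    return observed_het / expected_het

# Hypothetical counts: a chromosome with 1:1 heterozygosity and one with
# near complete LOH.
balanced = heterozygosity_coefficient(500, 500, 250, 1000)
loh = heterozygosity_coefficient(500, 5, 5, 1000)
```

A coefficient near 1 indicates that both homologues are present, whereas a coefficient near 0 indicates LOH.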
Extended Data Figure 3 | Calculation of haplotype copy number from
phased haplotypes. By sequencing monosomic cells we were able to determine 
the haplotype phases for chromosomes 1, 3, 4, 18, and X (Extended Data Fig. 2). 
a, Cartoon illustrating the strategy to calculate the copy number for each 
haplotype (homologue) from the coverage at individual polymorphic sites. The 
example is shown for a cell having one copy of each homologue. The left shows 
the aggregate sequence data; the right shows the deconvolution of the sequence 
based on the determined haplotype phase. The haplotype coverage is calculated 
by dividing the number of sequencing reads from the indicated haplotype by 
the total number of heterozygous sites. Copy-number alterations affecting only 
one homologue can be directly identified by calculating the ratio between the 
two homologues. Note that this approach is robust to any recurrent 
amplification bias that equally affects both homologues. b, Validation of copy- 
number alterations in Chr. X. The haplotype phase was inferred from the 
sequence of a singleton cell with monosomic Chr. X. The DNA copy number 
analysis alone suggested frequent Chr. Xq gains shared in many daughter pairs 
(including all controls, Supplementary Table 2). We considered it unlikely that 
these inferred copy number alterations were genuine because they were not 
present in the bulk sample (Supplementary Table 2). We calculated the 
haplotype copy number to distinguish true copy-number alterations from a 
potential systematic amplification bias for Chr. X. Each dot represents the 
haplotype copy number calculated as the average coverage at all sites within 
each bin for which phase information could be obtained (that is, where there 
was coverage in the reference cell that we sequenced with Chr. X monosomy). 
Haplotype copy number of Chr. X in 0.1 Mb bins in MN3 confirmed that the
small yet significant gain in Xq relative to Xp affects both haplotypes equally,
thus excluding the Xq gain as a genuine copy number alteration. Initial read- 
depth-based copy number analysis (Supplementary Table 2, Methods) also 
implied that there could be a copy number asymmetry for Chr. Xq for the two 
daughters from sample C1 as well as for Chr. X between the MN7 daughters 
(Fig. 2b). Calculation of the copy number of Chr. X for 0.25 Mb bins in C1 
confirmed that there is no difference between the two haplotypes in Xq; thus, 
any variation in the combined sequence coverage likely also reflects 
amplification bias. By contrast, the copy number of Chr. X in MN7 (1 Mb bins) 
indicated that there is a true gain of a single haplotype that generates trisomy for 
Chr. X that is shared (pre-existing) in both daughters. c, Use of haplotype copy 
number to validate the 3:2 segregation patterns in the MN7 and MN8 daughters 
inferred from DNA copy number. The haplotype phase for Chr. 1 was 
determined from the sequence of the MN1 minus cell. Coverage of the intact 
haplotype (blue dots) is evenly distributed between the daughters and 
throughout the chromosome; the mean coverage of this intact haplotype was 
used to calculate the normalized copy number of the mis-segregated/gained 
haplotype. For the MN7 daughters, there is a single copy gain of Chr. 1q in the 
plus daughter. By contrast, there is no gain of Chr. 1p in either cell, providing an 
internal control. In MN8, nearly an entire copy of Chr. 4 was gained in the plus 
daughter, with the exception of two segments partitioned into the minus 
daughter. The gains and losses both occurred only to the mis-segregated 
haplotype (orange). The reciprocal gain and loss of these segments in the two 
daughter cells illustrates the sensitivity of the method. 
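
The binned haplotype copy number calculation described in this legend can be sketched as follows. The input layout, the function name, and the choice to normalize both haplotypes by the chromosome-wide mean coverage of the intact haplotype are assumptions made for illustration.

```python
from collections import defaultdict

def haplotype_copy_number(sites, bin_size: int):
    """Per-bin copy number of the intact and gained haplotypes.

    sites: iterable of (position_bp, reads_intact, reads_gained) at phased
           heterozygous sites.
    Returns {bin_index: (intact_copy_number, gained_copy_number)}.
    """
    # Each bin stores [number of sites, intact reads, gained reads].
    bins = defaultdict(lambda: [0, 0, 0])
    for pos, intact, gained in sites:
        entry = bins[pos // bin_size]
        entry[0] += 1
        entry[1] += intact
        entry[2] += gained

    # Mean per-site coverage of the intact haplotype over the whole chromosome.
    total_sites = sum(entry[0] for entry in bins.values())
    mean_intact = sum(entry[1] for entry in bins.values()) / total_sites

    return {
        idx: (entry[1] / entry[0] / mean_intact,
              entry[2] / entry[0] / mean_intact)
        for idx, entry in sorted(bins.items())
    }

# Toy example: the second 1 Mb bin carries an extra copy of one haplotype.
toy_sites = [(100, 2, 2), (200, 2, 2), (1_000_100, 2, 4), (1_000_200, 2, 4)]
copy_number = haplotype_copy_number(toy_sites, 1_000_000)
```

Because both haplotypes are divided by the same reference coverage, any amplification bias that affects the two homologues equally cancels in their ratio.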
Extended Data Figure 4 | Statistical analysis of the enrichment of structural
variants on mis-segregated chromosomes. a, No enrichment of structural 
variants was observed in control chromosomes, including all chromosomes 
after the division of non-micronucleated mothers plus all normally segregated 
chromosomes after the division of micronucleated mothers. Shown are the 
frequencies of short- and long-range chromosomal rearrangements, at inverted 
(‘head-to-head/tail-to-tail’) or non-inverted (‘head-to-tail/tail-to-head’) 
orientations, detected in all control chromosomes, plotted for each 
chromosome arm. The frequencies were calculated by dividing the total 
number of rearrangement breakpoints of each type for each chromosome by 
the total length of the chromosome and after correcting for copy-number 
alterations for each chromosome and the detection sensitivity in each sample. 
Fluctuation around the mean value across the genome is insignificant for all 
groups (P > 0.05, χ² test for a normal distribution based on the observation).
b, Enrichment of structural variants specifically on the mis-segregated 
chromosomes identified by asymmetric copy number. Top, frequency of all 
structural variants (breakpoints per Mb, normalized for DNA copy number 
and detection sensitivity) detected in the mis-segregated chromosomes (both 
plus and minus cell combined) as compared with all the remaining, normally
segregated chromosomes; middle, frequency of intrachromosomal SVs with
breakpoint distance <150 kb, showing no enrichment; bottom, frequency of
long-range SVs (intrachromosomal SVs with breakpoint distance >150 kb or
joining different chromosomes). P values are derived from a one-sided Poisson 
test (Methods). c, Mutually exclusive distribution of SV breakpoints between 
the two daughters. There are three categories of events, those unique to cell ‘a’, 
those unique to cell ‘b’, and those shared between both cells. The frequencies in
each category can be estimated given the detection sensitivity in each library, 
which can be inferred from the sequence coverage at heterozygous SNPs 
(Methods). By a multinomial test, the large number of SVs detected in the mis- 
segregated chromosome that are unique to each daughter cell cannot be 
explained by incomplete detection of pre-existing SVs shared between the cells. 
This contrasts with the few shared events (one each in MN1 and in MN3; none 
in the others). This conclusion holds for all daughter cell pairs except those 
from the MN2 and MN5 mothers. For the MN2 daughters the small number of 
evaluable events does not reach statistical significance. The MN5 daughters are 
the one negative example where there do not appear to be chromosomes that 
underwent any significant rearrangement. 


©2015 Macmillan Publishers Limited. All rights reserved 


[Figure: a, inverted type (head-head/tail-tail) and non-inverted type (head-tail/tail-head); p53 Knockdown; p53 Knockdown + Noc. release; Micronucleated cells excluding the mis-segregated chrs. b, All control chromosomes. c, All mis-segregated chromosomes.]


Extended Data Figure 5 | Length distribution of structural variants detected 
in single cells. a, Short-range rearrangements in control samples are enriched 
for the inverted orientation. The number of structural variants (SV) detected in 
three groups of controls broken down by the distance between the rearranged 
sequences. Left, p53 knockdown cells (C1-C4); Middle, p53 knockdown + 
nocodazole release (N1-N6); Right, micronucleated cells (MN1-MN9), but 
excluding the mis-segregated chromosome. Note a significant enrichment of 
events in the inverted orientation is observed in the 20-150 kb range in all three 
groups (binomial test, P < 0.05), but there is no such enrichment for SV events
with breakpoint distances exceeding 150 kb or joining different chromosomes. 
The enrichment of short-range inverted type rearrangements can be explained 
by the fact that Phi-29-based multi-strand displacement (MDA) reaction used 
to amplify DNA generates frequent short-range inverted chimaeras”. b, Scaling 
relationship between the frequency of SVs and the distance between 
breakpoints in control chromosomes. Left, combined frequency of SVs from all 
control chromosomes from a above. Right, the cumulative distribution of 
inverted (red dots) or non-inverted (black dots) SV events as a function of 
breakpoint distance. Note, the graph shows the accumulation from infrequent 
long-range rearrangements (starting on the right) to more frequent short-range 
rearrangements (finishing on the left). We expect that genuine rearrangements 
should equally favour inverted or non-inverted orientations and attribute the 
bias towards the inverted-type to MDA artefacts. Thus, as a filter for potential 
MDA artefacts, we used power-law scaling to identify the breakpoint distance 
above which inverted and non-inverted type rearrangements occur with 
approximately equal frequency. In the 20-100 kb range, the data for inverted-
type rearrangements (632 events total) were best fit by a power law decay of
−1.176 (±0.006, 95% confidence interval) and for the non-inverted type
rearrangements (85 events total) by a power law decay of −0.395 (±0.005).
This difference is lost in the range of 200 kb-5 Mb, where the data for inverted-
type rearrangements (53 events total) were best fit by a power law decay of
−0.337 (±0.02) and for the non-inverted type events by a power law decay of
−0.31 (±0.005). The power law fitting for the inverted-type events in the 20-
100 kb range and the power law fitting for events in both groups in the 200 kb-5
Mb range (−0.322 ± 0.007) intersected at ~150 kb. This established an
operational cut-off of 150 kb to define ‘long-range’ rearrangements, above 
which there should be no enrichment of inverted-type MDA-generated
artificial chimaeras. c, Scaling relationship between the frequency of SVs and
the distance between breakpoints in mis-segregated chromosomes. Left, 
frequency of SVs from all mis-segregated chromosomes. By contrast with the 
controls in b, there is a marked enrichment of long-range SVs in this group with 
breakpoints >500 kb apart. Indeed, the frequency of long-range
rearrangements in these samples is substantially higher than the frequency of 
short-range rearrangements. The fact that these SVs are concentrated on the 
mis-segregated chromosome in the plus daughter cell (Fig. 3b, Extended Data 
Table 1) suggests that these are genuine rearrangements of the under-replicated 
chromosome from the micronucleus. Considering only the mis-segregated 
chromosome, we find a less pronounced difference in the ratio between 
inverted and non-inverted type rearrangements, even for short-range events. 
We speculate that this occurs because the mis-segregated chromosome also has 
a high frequency of genuine short-range rearrangements that are not biased to 
be in an inverted orientation. These numbers could also reflect a smaller sample 
size of rearrangements or possibly be due to rearrangement of the damaged 
chromosome before amplification, altering the relationship of the starting 
sequence relative to the reference genome. By establishing the 150 kb cutoff to 
filter for artefacts, we thus likely exclude some genuine short-range events. 
Finally, we note that our power law scaling analysis for SVs detected in the mis- 
segregated chromosomes is consistent with other independent estimates for 
the likelihood of intrachromosomal contacts. The power law dependence of
~l^{-0.10} (power decay −0.105 ± 0.003 from all events in the range of 150 kb
~5 Mb) is equivalent to a density distribution of p(l) ~ l^{-1.10}. This dependence
is close to the length distribution of somatic copy number alterations (~l^{-1})
observed in cancers*®, and also to the distribution of breakpoint distances for 
somatic chromosomal rearrangements (J. Wala & R. Beroukhim, personal 
communication). Moreover, it is consistent with the probability of 
intrachromosomal contacts (~l^{-1.08}) inferred from Hi-C experiments’”. The
distribution of the long-range breakpoint distances from the mis-segregated 
chromosomes shown here (150 kb ~ 10 Mb) is significantly different from the 
distribution of rearrangements from all control chromosomes, shown above in 
b (P = 0.0043, Kolmogorov-Smirnov test). By contrast, pairwise comparisons 
of the distribution of the control samples (a, above) showed no significant 
differences (P = 0.6 for all events, P = 0.9 for long range events, K-S test). 
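
The step from the cumulative scaling to the density scaling quoted in c can be made explicit. The sketch below assumes that the fitted quantity is the number of events with breakpoint distance of at least l, in line with the cumulative plots that accumulate from the long-range end; the symbols N, C and α are introduced here for illustration.

```latex
N(\geq l) = C\, l^{-\alpha}, \qquad \alpha \approx 0.105
\;\Longrightarrow\;
p(l) = -\frac{\mathrm{d}N(\geq l)}{\mathrm{d}l}
     = \alpha C\, l^{-(\alpha + 1)} \;\propto\; l^{-1.105}
```

Differentiating the cumulative distribution therefore lowers the exponent by one, giving a density exponent of about −1.1.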
Extended Data Figure 6 | Validation of rearrangements by PCR and
haplotype phasing. a, PCR validation of 26 (out of 66) intrachromosomal 
rearrangements detected in Chr. 3 of the MN3 plus daughter. The complete 
results for all samples are summarized in Supplementary Table 5 with examples 
of the Sanger sequencing results shown below (c, d). For each rearrangement, 
two PCR reactions (‘RA’ across the putative rearrangement junction, ‘O’ for the
reference sequence) were performed on the MDA amplified DNA from both 
daughter cells. For each sample indicated above the gel, the leftmost lane is the 
PCR product for the rearrangement-specific primers in the minus daughter; the 
left middle lane the rearrangement-specific primers in the plus daughter; the 
right middle lane, the reference-specific primers in the minus daughter; and the 
right lane, the reference-specific primers in the plus daughter. In ‘RA2’ where 
there are heterozygous SNPs on both ends of the rearrangement junction, PCR 
was performed to generate (and have validated) the genotypes at both sites 
(‘OT2’ and ‘OB2’). 17 out of 26 PCR products confirmed the rearrangement 
junction after Sanger sequencing. The remaining 9 PCR reactions generated no 
product or products that did not correspond to the rearrangement sequence; 
several of these were due to low sequence complexity near the rearrangement 
junction that resulted in non-specific primer pairs. b, Cartoon showing the 
strategy for validating rearrangements and associating these rearrangements 
with the mis-segregated haplotype. Forward PCR primers were chosen 5’ from 
an informative heterozygous site and reverse primers were chosen 3’ of the 
rearrangement breakpoint junction, either in the rearranged DNA sequence or 
in the reference DNA sequence. PCR was performed to amplify both the 
rearranged product and the control reference genome product, followed by 
Sanger sequencing. Here the undetermined base at the polymorphic site is 
colored grey. c, Example of haplotype validation for a rearrangement in MN3 
based on an adjacent ‘C/T’ SNP. From the minus daughter, we were only able to 
amplify the reference product, which showed a ‘T’ at the polymorphic site.
Because MN3 underwent a 2:1 segregation, the mis-segregated haplotype was 
inferred to have a ‘C’ at this polymorphic site. From the plus daughter, we 
amplified both the rearranged and reference products. As expected, the 
rearranged showed a ‘C’ at the polymorphic site, indicating that the 
rearrangement occurred on the mis-segregated chromosome. Also as expected, 
there was a “T” at the polymorphic site on the reference product. The base 
associated with the mis-segregated haplotype is in red; the base for the normal 
haplotype is in blue. d, Example of haplotype validation for a rearrangement in 
MN4 yielding three products. In this case, there are two informative 
polymorphic sites near the rearrangement: the ‘T+G’ pair is associated with the
mis-segregated haplotype and the ‘C+T’ pair is associated with the intact
reference haplotype. As expected, the minus daughter had the reference
product showing the ‘C+T’ haplotype. Also as expected, the plus daughter had
the rearranged product showing the ‘T+G’ haplotype as well as the reference
product showing the ‘C+T’ haplotype. Somewhat unexpectedly, the plus
daughter also had a third product, a reference product associated with the 
‘T+G’ mis-segregated haplotype. We speculate that the presence of both 
rearranged and reference products with the mis-segregated haplotype results 
from partial replication of this region of the mis-segregated chromosome. 

e, Proposed partial replication/replication fork breakage model to explain the 
presence of three products detected in d above. Shown on the left is a replication 
fork on the mis-segregated chromosome, in the middle are the products of 
replication or recombination, and on the right are the products with the red bar 
indicating that the products are associated with the mis-segregated haplotype. 
The original DNA strands are in dark red; the newly synthesized ones are in 
light red; DNA from a distal rearrangement partner locus is in blue. We 
hypothesize that breakage of the replication fork (scissors) generates a single- 
end break that recombines with the distal locus shown in blue. The other end of 
the replication fork generates a reference product. Importantly, both products 
should contain the base(s) associated with the mis-segregated haplotype. The 
presence of both rearranged and reference products containing the mis- 
segregated haplotype could also occur if the rearrangement is an artificial 
chimaera that arose during MDA amplification. However, such artefacts 
should not be restricted to a single homologous chromosome: the highly 
significant enrichment of rearrangements on the mis-segregated chromosome 
and their association with the mis-segregated/gained haplotype establish that 
most of these rearrangements are genuine. Notes: 1. Rearrangements were not 
only associated with the mis-segregated haplotype by PCR, but in some cases 
this association could be made directly by sequencing read-based phasing using 
either discordant read pairs or split reads that covered heterozygous SNP sites 
close to either side of the breakpoint. This analysis enabled us to determine the 
haplotype association for 10 events in MN3 (3 in addition to Sanger 
sequencing), 5 events in MN4 (3 in addition to Sanger sequencing), and 6 
events in MN8 (5 in addition to Sanger sequencing), all confirmed to be 
associated with the gained haplotype. 2. For the daughters with a 3:2 segregation 
pattern, even if the plus cell contained an intact copy of the mis-segregated 
chromosome, we expect a replicate of this homologue to be present in the 
minus cell because that copy of the homologue was normally segregated. 
Because we did not detect any rearrangements in the minus cell, the 
rearrangements detected in the plus daughter can only come from the mis- 
segregated copy of the homologue in that daughter. 
Extended Data Figure 7 | Sequence features at the rearrangement junctions.
a, Length distribution of microhomology at the junctions of rearrangements in 
control chromosomes (top), in the mis-segregated chromosomes of daughters 
with 2:1 segregations (middle), and in the mis-segregated chromosomes from 
daughters with 3:2 segregations (bottom). The distribution of microhomology 
at rearrangement junctions detected in all control daughters is
indistinguishable from that detected in the control chromosomes in the
micronucleated daughters, with ~75% of events showing 0-1 bp homology. By 
contrast, rearrangements in the mis-segregated chromosomes contain a higher 
percentage of microhomology: more than 50% of all events exhibited >1 bp 
homology in every sample. b, Chained translocations between breakpoints on 
Chr. 18 in the MN6 plus daughter. Left, CIRCOS plot for the translocation 
chain in Chr.18; Right, translocation links between 16 breakpoints, 14 of which 
had paired break ends forming a chain. All events were validated by PCR, and 
red links reflect rearrangements that were associated with the gained haplotype
through nearby SNPs (Supplementary Table 5). c, Examples of short (50-500 
bp) insertions at breakpoint junctions. Insertions are represented as arrows 
pointing along the 5'→3' orientation of the reference sequence with
coordinates shown on the right. Dashed links represent read pairs supporting 
the given junction. In addition to insertions derived from the mis-segregated 
chromosome and inserted into rearrangements in the mis-segregated 
chromosome, we identified additional examples as follows: For the MN3 
sample we identified one example of Chr. 3 insertion into a rearrangement 
between two loci in (normally segregated) Chr. 14. For the MN4 sample we also 
identified one example where a short segment from (normally segregated) Chr. 
2 was inserted into a rearrangement between loci in the mis-segregated Chr. 3. 
For the MN8 sample, where both Chr. 4 and Chr. 11 were inferred to have been 
fragmented in the same micronucleus, we identified one example where a 
rearrangement between loci on Chr. 4 contained a 95 bp insertion from Chr. 11 
in the rearrangement junction and another example of a 279 bp segment
originating from Chr. 11 inserted into a Chr. 4–Chr. 11 rearrangement junction.
In MN9, we have identified >20 insertions at sites of long-range 
rearrangements on Chr. 8 via local sequence assembly (Supplementary Table 
6). Here we show one translocation junction containing 8 short segments from 
all over Chr. 8. The segments at the boundaries of the rearrangement are in bold 
outline. Between the boundaries are 8 short insertions (47-433 bps, grey, green 
or purple bars) from different parts of Chr. 8 (cluster 1). Green and purple bars 
indicate insertions originating at or near breakpoints of other rearrangements 
(clusters 2 and 3, correspondingly green or purple). As the average detection 
sensitivity was ~35% for each library (Supplementary Table 1), it is likely that 
some insertions could have been missed. Importantly, the insertions only come 
from distal sites on the mis-segregated chromosome(s) and such insertions are 
not detected in any control samples. Notes: 1. We can only determine the 
presence of insertion sequences <500 bp as most sequencing fragments are 
shorter than 600 bp (99% of fragments are shorter than 600 bp in the DNA 
library of the MN9 plus daughter, <513 bp for the MN1 plus daughter, <380 
bp for the MN3 plus daughter, MN4 daughters, MN8 plus daughter, and <350 
bp for the MN7 plus daughter). 2. The MN9 sample has many more insertions 
than the other samples with the inserted segments frequently derived from 
sequences near other rearrangement breakpoints. The explanation for this is 
not clear and future work will require experiments of a larger sample size. 
However, unlike the MN1-MN8 daughters, which were isolated shortly after 
division of the micronucleated mother, the MN9 plus cell remained arrested for 
a ~2 day period of time while the minus daughter divided twice. We speculate
that the mis-segregated chromosome in the plus daughter from the MN9
daughter pair may have undergone MMBIR as part of the mechanism that
combined these Chr. 8 fragments. It is also possible that breakpoint ends in the
MN9 plus cell could have been fragmented into small segments.
Extended Data Figure 8 | Evidence of chromosomal fragmentation detected
from haplotype copy-number analyses in three examples, MN2, MN4, and 
MN8B. a, Inference of chromothripsis of Chr. 2 in the MN2 daughters without 
knowledge of the Chr. 2 haplotype phase. Left, CIRCOS plot. Right, Plot of the 
heterozygosity coefficients in 250 kb bins, with rearrangement links indicated 
(above, non-inverted type; below, inverted type). Note that the MN2 daughters 
underwent a 2:1 distribution of the mis-segregated chromosome, implying that 
any chromosomal loss generates loss-of-heterozygosity. The heterozygosity 
plot demonstrates that a pericentric fragment of 2q is partitioned into the 
minus cell, whereas the remainder of Chr. 2 is in the plus daughter. 
Chromosomal rearrangements are only observed in heterozygous regions, 
consistent with heterozygosity originating from the damaged/under-replicated 
homologue from the micronucleus. Each dot represents the heterozygosity 
coefficient in a 250 kb bin (~50 heterozygous sites per bin). Bins with fewer 
than 25 phaseable heterozygous sites or showing only 1–2 observed
heterozygous sites are not shown. b, Haplotype copy number of Chr. 3 in the 
MN4 daughters (100 kb resolution). Left, CIRCOS plot. Right, Chr. 3 haplotype 
copy number in MN4 daughters calculated from the chromosomal haplotype
phase derived from the sequence of the MN3 minus daughter. Each dot 
represents average haplotype copy number in a 100 kb bin: the normally 
segregated haplotype (blue dots) is equivalently detected in both daughters 
whereas the fragmented haplotype (orange dots) shows oscillating and 
reciprocal retention and loss between the two daughters. c, Haplotype copy 
number of Chr. 4 in the MN8 daughters with rearrangement links. The 
haplotype phase for Chr. 4 was inferred from the sequence of a single cell with 
Chr. 4 monosomy (Extended Data Fig. 2c). Gains of the mis-segregated 
haplotype (orange dots) are reciprocal in both daughters; except for one 
rearrangement in the plus daughter, all detected breakpoints, including both 
intrachromosomal events (links) and interchromosomal events with Chr. 11 
(vertical magenta lines), are restricted to regions of gains in the mis-segregated 
haplotype. Red links indicate translocations that are associated with the mis- 
segregated haplotype by informative SNPs near the breakpoints; black links 
indicate rearrangements for which phasing validation was not performed (a 
subset of which have no adjacent SNPs). 
Extended Data Table 1 | Summary table of the rearrangements detected in the daughters of micronucleated mothers and their enrichment on
the mis-segregated chromosome

                                 CN of   Long-range SVs     Short-range SVs    Poisson  Fisher’s PCR        Haplotype
Sample MS Chr.          Cell     MS Chr. MS Chr. other Chr. MS Chr. other Chr. test     test     validation validation
MN1    Chr. 1           (+) cell 2       19      4          3       10         <10^-10  10^-12   7/10       3/3
                        (-) cell 1       none    5          none    14
MN2    Chr. 2           (+) cell 1.7     7       8          0       32         10^-10   10^-?    3/5        2/2
                        (-) cell 1.1     2       3          3       40
MN3    Chr. 3           (+) cell 2       67      12         4       20         <10^-10  10^-24   17/26      15/15 +3†
                        (-) cell 1       none    1          none    9
MN4    Chr. 3           (+) cell 1.5     25      12         3       77         <10^-10  10^-28   10/14      5/5 +3
                        (-) cell 1.2     13      13         5       62
MN5    Chr. 7           (+) cell 2       1       8          3       17         0.34     0.28
                        (-) cell 1       none    6          none    11
MN6    Chr. 18          (+) cell 2       5       4          0       21         10^-9    10^-4    8/8*       5/5
                        (-) cell 1       1       20         0       13
MN7    Chr. 1q          (+) cell 2.9     8       1          1       9          <10^-10  10^-7    3/6        [illegible]
                        (-) cell 2.1     none    5          none    17
MN8    Chrs. 4 and 11   (+) cell 2.9     36      8          15      51         <10^-10  10^-22   7/8        [illegible] +5
                        (-) cell 2       3       6          4       23
MN9    Chr. 8           (+) cell 2.8     4       7          3       29         <10^-10  10^-28
                        (-) cell 2       1       2          1       20


MS. Chr., mis-segregated chromosome; CN, copy number. P value for enrichment was derived from a one-sided Poisson test of the observed number of rearrangements in the mis-segregated chromosome 
against the average frequency of rearrangements in the whole genome for each daughter pair. Fisher’s test compared the observed numbers of short and long-range rearrangements on the mis-segregated 
chromosome (both daughters) with the corresponding numbers from the remaining chromosomes in the same daughter cell pair. The enrichment for long-range rearrangements on the mis-segregated 
chromosomes was highly significant as compared to short-range events, which had no enrichment. PCR validation confirmed the presence of a PCR amplified product and the mapping of the sequence to 
the two partner loci of the rearrangement (but does not exclude artificial chimaeras); haplotype validation further associated the rearrangement with the gained (mis-segregated) haplotype inferred from haplotype
copy number analysis (Extended Data Fig. 3; Methods). Validation results are presented as the number of validated cases divided by the number of total attempts; in haplotype validation, the number of attempts
only includes cases where the informative SNP base can be confidently determined by Sanger sequencing across the rearranged junction (Supplementary Table 5).

* In the MN6 plus daughter, the standard rearrangement analysis detected five SV events in Chr. 18, three of which were inferred to be in a chain of eight translocations between multiple double-strand breaks. All
these eight events were validated by PCR and sequencing, with five out of eight associated with the mis-segregated haplotype. In all statistical analysis of this chromosome (Extended Data Fig. 4b, c) we only counted
the five events to ensure consistency with the estimated detection sensitivity. 

† We included additional events in the haplotype validation (‘+’) when the Sanger sequencing was unable to generate the haplotype information, but direct phasing of the rearrangement was possible through
supporting short-reads with an adjacent SNP. In both MN3 and MN4 plus daughters, there was one translocation between the mis-segregated chromosome and a different chromosome (see Extended Data Fig. 
7c): these events were counted as ‘de novo events in the mis-segregated chromosome’ although excluding them does not change the statistical analysis. 
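
The one-sided Poisson test described in the table legend can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `poisson_upper_tail` is hypothetical, and how the expected count is derived from the genome-wide average (for example, by scaling with chromosome length) is an assumption that the legend does not spell out. The upper tail is summed term by term so that very small P values are not lost to rounding.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected), with expected > 0."""
    if observed <= 0:
        return 1.0
    # Logarithm of the first tail term, exp(-lam) * lam**k / k!, to avoid overflow.
    log_term = -expected + observed * math.log(expected) - math.lgamma(observed + 1)
    term = math.exp(log_term)
    total = 0.0
    k = observed
    # Add successive terms until they no longer change the running sum.
    while term > 0.0 and term > total * 1e-17:
        total += term
        k += 1
        term *= expected / k
    return total


if __name__ == "__main__":
    # Hypothetical inputs: 19 rearrangements observed on one chromosome where
    # the genome-wide average would predict 2 (the value 2 is illustrative only).
    print(f"one-sided P = {poisson_upper_tail(19, 2.0):.2e}")
```

Summing the tail directly, rather than computing one minus the cumulative probability, is a design choice that keeps P values far below 10^-10 representable.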
Small particles dominate Saturn’s Phoebe ring to
surprisingly large distances 


Douglas P. Hamilton, Michael F. Skrutskie, Anne J. Verbiscer & Frank J. Masci


Saturn’s faint outermost ring, discovered in 2009 (ref. 1), is probably
formed by particles ejected from the distant moon Phoebe.
The ring was detected between distances of 128 and 207 Saturn
radii (Rs = 60,330 kilometres) from the planet, with a full vertical 
extent of 40Rs, making it well over ten times larger than Saturn’s 
hitherto largest known ring, the E ring. The total radial extent of 
the Phoebe ring could not, however, be determined at that time, nor 
could particle sizes be significantly constrained. Here we report 
infrared imaging of the entire ring, which extends from 100Rs 
out to a surprisingly distant 270Rs. We model the orbital dynamics 
of ring particles launched from Phoebe, and construct theoretical 
power-law profiles of the particle size distribution. We find that 
very steep profiles fit the data best, and that elevated grain tem- 
peratures, arising because of the radiative inefficiency of the smal- 
lest grains, probably contribute to the steepness. By converting our 
constraint on particle sizes into a form that is independent of the 
uncertain size distribution, we determine that particles with radii 
greater than ten centimetres, whose orbits do not decay appreciably 
inward over 4.5 billion years, contribute at most about ten per cent 
to the cross-sectional area of the ring’s dusty component. 

In the course of mapping the entire sky, NASA’s WISE spacecraft
observed Saturn’s outer Phoebe ring at a wavelength of 22 µm in June
2010 (Fig. 1). The ring appears in its entirety, spanning an area of sky
nearly 7,000 times larger than Saturn itself. Discovered by the Spitzer
Space Telescope at wavelengths of 24 µm and 70 µm, the Phoebe ring
was recently detected at optical wavelengths by Cassini. To highlight
the faintest outer material, we suppress the vertical dimension and 
construct radial traces of the ring in Fig. 2 (Methods). Our measure- 
ments show that the ring extends to at least 270Rs, well beyond the 
moon Phoebe, which traverses the region 180-250Rs. The ring is also 
clearly seen inward to at least 100Rs (Fig. 1a) and to perhaps 50Rs 
(Fig. 1b) before being lost in the glare from Saturn. 

To model the ring’s structure, we follow the orbital motions of dust 
grains of multiple sizes launched from Phoebe at different points along 
its orbit. The most important forces affecting dust in the Phoebe ring 
are solar radiation pressure and the much weaker Poynting-Robertson
drag. Both of these forces arise from interactions with sunlight: the
first due to the absorption of solar photons and the second due primarily
to the slightly asymmetric re-emission of the absorbed energy.
Radiation pressure causes dust grain eccentricities to oscillate with a 
period of approximately 30 years and is important for grains with radii 
s < 100 µm. These grains form a distribution that is offset towards the
Sun, but still left-right symmetric. Poynting-Robertson drag, although
extremely weak, imparts an important systematic inward decay
towards Saturn with a characteristic timescale of $1.5 \times 10^{5}\,s$ years,
where s is in units of µm; hence a 3-cm particle will evolve from
Phoebe (with semimajor axis a = 215Rs) inward to the moon
Iapetus (a = 60Rs) over the age of the Solar System.
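
The scaling quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the Solar System age of 4.5 billion years is taken from the abstract. It confirms that a 3-cm grain has a decay time equal to that age.

```python
# Characteristic Poynting-Robertson decay time quoted in the text:
# t ~ 1.5e5 * s years, with the grain radius s in micrometres.
PR_YEARS_PER_MICRON = 1.5e5
SOLAR_SYSTEM_AGE_YEARS = 4.5e9  # value used in the abstract


def pr_decay_time_years(radius_um: float) -> float:
    """Return the characteristic inward-decay time for a grain of this radius."""
    return PR_YEARS_PER_MICRON * radius_um


def largest_radius_decaying_within(age_years: float) -> float:
    """Return the radius (micrometres) whose decay time equals age_years."""
    return age_years / PR_YEARS_PER_MICRON


if __name__ == "__main__":
    for radius_um in (4.0, 100.0, 3.0e4):  # 4 µm, 100 µm and 3 cm
        years = pr_decay_time_years(radius_um)
        print(f"s = {radius_um:g} um: decay time = {years:.2e} yr")
    limit_um = largest_radius_decaying_within(SOLAR_SYSTEM_AGE_YEARS)
    print(f"largest radius decaying within the Solar System age: {limit_um:g} um")
```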

We use the numerical code dI (dust Integrator) to predict the
orbits of particles with radii ranging from 4 µm to 10 m released from
Phoebe. We launch dust grains with radii of 4, 6, 10, 15, 25, 40, 60 and
100 µm and with eight different angular positions of Phoebe’s
pericentre relative to the Sun, an important parameter when radiation 
pressure is strong. For larger grains, we continue to use five logarith- 
mically spaced sizes per decade in radius, but follow only a single 
launch condition, as the dynamics of large grains are only weakly 
affected by radiation pressure. To speed up the integrations, we 
artificially enhance the rate of Poynting-Robertson drag by a factor 
ranging from 10 to 450. This results in rapid but otherwise identical 
orbital evolution as long as the inward drag timescale remains much 
longer than all other important timescales. We verified that this 
approximation is valid for our integrations. 
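
A grid with five logarithmically spaced sizes per decade can be generated as in the sketch below. The helper `log_spaced_radii` is hypothetical, and the formula $10^{k/5}$ is an assumption about how such a grid is built; it yields values close to, but not identical with, the rounded radii listed above (for example 15.8 rather than 15, and 63.1 rather than 60).

```python
def log_spaced_radii(min_exp: float, max_exp: float, per_decade: int = 5) -> list:
    """Return radii 10**(k / per_decade), in micrometres, spanning
    10**min_exp to 10**max_exp inclusive."""
    first = round(min_exp * per_decade)
    last = round(max_exp * per_decade)
    return [10 ** (k / per_decade) for k in range(first, last + 1)]


if __name__ == "__main__":
    # From about 4 µm (10**0.6) up to 100 µm (10**2).
    small_grains = log_spaced_radii(0.6, 2.0)
    print([round(s, 1) for s in small_grains])
```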

We stop the integrations when the dust grains cross the orbit of 
Titan at 20Rs because collisions with that massive satellite occur within
~10,000 years, far shorter than the timescale for inward migration by
Poynting-Robertson drag, even for the smallest particles. Grains with
s > 150 µm remain on fairly low-eccentricity orbits, and so we stop
those integrations when they reach Iapetus’ orbit—this is an excellent 
approximation since the inward drag timescale exceeds the Iapetus 
collision timescale of a few million years. But since the WISE images 
Figure 1 | WISE Band 4 mosaic of the Phoebe ring. a, Individual WISE
frames manually combined: scattered light from Saturn forms the large white 
overexposed blob with a black centre while four diagonal diffraction spikes 
radiate outward. Bright reflections of Saturn are visible as smaller white lumps 
with black centres at the six and twelve o’clock positions. Iapetus (black dot) 
and the more distant Phoebe (white dot) are visible at nine o’clock. The Phoebe 
ring is the white, horizontally oriented 550Rs × 40Rs rectangle. b, We subtract
+90° rotations of the top frame from itself, yielding clean and cluttered ring 
ansae (the apparent ends of edge-on rings); here we stitch the two clean ansae 
together to significantly reduce scattered light and reveal the ring’s inner 
regions. Distance scale applies to a and b. 
Figure 2 | Measured radial profiles. We sum over 30 × 40 pixel regions in
Fig. 1 and subtract a sky signal as described in Methods to obtain one- 
dimensional radial profiles from each side of Fig. 1a and b. A fifth curve is 
derived directly from the WISE Image Atlas. All curves agree to within their 
intrinsic scatter which provides an important consistency check. Accordingly, 
the solid black data points with error bars represent the average and the 
scatter from our five measurements. Ring flux is clearly detectable to at least 
270Rs, well beyond Phoebe’s apocentre distance of 250Rs. All ring profiles 
agree well inward to about 100Rs, at which point scattered light from Saturn
becomes problematic. 


only provide trustworthy fluxes outside about 100Rs (Figs 1 and 2), the
comparison of data to theory is rather insensitive to the exact details of 
how the collisional sweeping of Iapetus at 60Rs is modelled. 

Next, for each simulation we match the WISE viewing geometry by 
transforming the numerical data (a list of positions versus time for the 
lifetime of the integration) into a Saturn-centred reference frame that 
rotates so that the Sun always stays along the x axis. We view this 
distribution of dust from the direction of the Sun to produce an arti- 
ficial y-z image on the sky that closely matches the geometry of Fig. 1. 
We sort the evenly-spaced output from our numerical simulations into 
line-of-sight bins, sum over the vertical dimension z, and produce 
predicted profiles of flux versus the y coordinate for each individual 
grain size whose dynamics we model. For the smallest dust grains, we 
average the distributions from the eight different Phoebe launch azi- 
muths to get realistic profiles. Finally, we sum together distributions 
for individual grain sizes with different weighting functions. For sim- 
plicity, we assume that particles are continuously created at Phoebe 
according to a power-law size distribution of the form $N(s)\,ds \propto s^{-q}\,ds$,
where N(s) is the number of particles of radius s in the range [s, s + ds] 
and q is the power-law index. This procedure correctly weights the 
contribution of all particle sizes to the ring flux by explicitly convolving 
the production function with orbital dynamics including particle life- 
times. The results are theoretical predictions that we normalize to the 
observed profiles of Fig. 2. 
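
The weighting and normalization steps can be illustrated with the sketch below. It is not the authors' code: the function names and the toy profiles are hypothetical, and it assumes that each per-size profile from the simulations already carries the grain cross-section and lifetime, so that only the production weight $s^{-q}\,ds$ remains to be applied.

```python
def combine_profiles(radii_um, profiles, q):
    """Sum per-size radial profiles with weights s**-q * ds.

    radii_um -- increasing list of grain radii
    profiles -- one list of flux values per radius, all of the same length
    q        -- power-law index of the production function
    """
    widths = []
    for i, s in enumerate(radii_um):
        # Bin width: half the distance between the neighbouring radii.
        lower = radii_um[i - 1] if i > 0 else s
        upper = radii_um[i + 1] if i + 1 < len(radii_um) else s
        widths.append(0.5 * (upper - lower))
    weights = [s ** (-q) * w for s, w in zip(radii_um, widths)]
    n_bins = len(profiles[0])
    return [sum(w * p[j] for w, p in zip(weights, profiles)) for j in range(n_bins)]


def normalize_at(profile, index, target):
    """Rescale a profile so that profile[index] equals target."""
    scale = target / profile[index]
    return [scale * value for value in profile]


if __name__ == "__main__":
    radii = [4.0, 10.0, 100.0]            # micrometres
    toy_profiles = [[3.0, 2.0, 1.0],      # hypothetical flux in three radial bins
                    [1.0, 2.0, 1.0],
                    [0.0, 1.0, 2.0]]
    for q in (2.0, 4.0, 6.0):
        combined = combine_profiles(radii, toy_profiles, q)
        print(q, normalize_at(combined, 1, 1.0))
```

Steeper indices give more weight to the first (smallest-grain) profile, which is the behaviour the text relies on when comparing shallow and steep distributions.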

In Fig. 3, we limit particle sizes to the range from 4 µm, the smallest
grains whose orbits are not all immediately forced by radiation pressure
onto Titan-crossing orbits, to 100 µm, roughly the largest size where
radiation pressure still matters. All distributions have some difficulty 
matching the two data points with y < 120Rs. These points are the 
most strongly affected by scattered light from Saturn and by our imper- 
fect subtraction of this scattered light (Fig. 1). Focusing our attention 
on the more distant points, we note that the steepest power-law dis- 
tributions (q = 4-6) fit the data best, highlighting the relative import- 
ance of small particles. Such steep particle size distributions are unusual 
in the Solar System. Nevertheless, steep size distributions of particles 
launched from Phoebe provide a satisfactory fit to the data. 
Figure 3 | Theoretical radial profiles for 4–100 µm size distributions. We
plot differential power-law particle size distributions with indices q = 1, 2 
and 3 (dotted lines), 4 and 5 (solid line), and q = 6 (dashed line). A q = 3 
power-law index puts an equal mass of ring particles in each logarithmic size 
interval; q = 2 does the same for surface area. We compute radial profiles 
for individual particle sizes from our numerical modelling and then combine 
them with different assumed power-law production rates, normalizing all 
curves to pass near the data point at 140Rs. The shallow power-law size 
distributions (dotted lines) dominated by large particles are a poor match to 
the data, while steep size distributions more closely match. 


As steep power-law distributions are somewhat surprising, we test 
the effect of the upper cutoff size by raising it above 3 cm, to include 
ring particles that do not decay inward to Iapetus over the age of the 
Solar System. For example, 30-cm debris from Phoebe should be con- 
fined between 160Rs and 250Rs, perhaps in sufficient quantities to 
improve the fit of the shallower power-law distributions in that region. 
In Fig. 4, we extend the size distribution up to 30 cm and find that the 
shallow power-law distributions, which highlight the contribution of 
3-30 cm grains, are dramatically affected. Centimetre-sized material 
produces a ring with a largely empty interior region which, when 
viewed edge-on, leads to diminished flux close to the planet. 

The shallow power-law distributions, however, predict rings that 
end abruptly at 250Rs, Phoebe’s apocentre, in contrast to the data. 
Rather than immediately ruling these distributions out, we consider
relaxing another of our model assumptions. A promising
improvement would be to include Phoebe family members (small
satellites with Phoebe-like orbital inclinations) as additional
sources of ring material. Interestingly, these satellites are all more 
distant than Phoebe itself. While the known satellites comprise <1% 
of Phoebe’s cross-sectional area, undiscovered 100-m sized and even 
kilometre-sized objects could raise this percentage significantly, mak- 
ing this population a correspondingly more important source of ring 
material. The most relevant effect of these additional sources on the 
theoretical ring profiles of Fig. 4 would be to raise the predicted ring 
flux outside 250Rs to more closely match the observations. 

Two further assumptions might be relaxed. First, the size distribution 
of particles produced in the ring need not follow a power law, and 
second, the distribution of particle sizes may be modified by mutual 
collisions as the material drifts inward towards Iapetus. Collisions are 
most important for large particles that would otherwise remain in orbit 
for hundreds of millions to billions of years. Accordingly, shallow 
power laws with indices of q = 1–2, or indeed any other distribution
dominated by large particles, would probably evolve towards a more 
typical collisional distribution with index q = 3-4. For these reasons, we 
continue to favour the steeper size distributions. 
Figure 4 | Theoretical radial profiles for 4 µm–30 cm size distributions. The
curves are predictions for assumed differential power-law particle size 
distributions, as in Fig. 3. Shallow size distributions (dotted lines) predict a ring 
dominated by large particles, with diminished flux from the ring’s interior due 
to the reduced grain mobility. A distribution with power index q = 1, with large 
particles producing 90% of the observed ring flux, predicts a peak at 170Rs that
is clearly inconsistent with the data. By contrast, index q = 2, with 20% of the 
flux from 3-30 cm particles, perhaps adequately matches the data, at least 
interior to Phoebe’s apocentre at 250Rs. Distributions with steeper power-law 
indices still match the data best. 


True particle sizes in the ring, however, are skewed by the enhanced 
infrared emission of the smallest particles. Because the peak wavelength
of a blackbody thermal emission spectrum (~35 µm for
~80 K) is significantly larger than the smallest grains in the ring, these
particles emit long-wavelength radiation inefficiently. Thus, the temperatures
of small particles rise, more energy is emitted at shorter
wavelengths, the emission peak rises and moves closer to the WISE
22 µm band, and the measured flux increases significantly. This drives
our best fits towards steeper power-law indices, as the fits actually 
represent convolutions of the true particle size distribution residing 
in the ring with the enhanced emission expected from small and rela- 
tively hot dust grains. 
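
The peak wavelength quoted above follows from Wien's displacement law, as the sketch below shows. It is only an illustration of the blackbody relation, with hypothetical function names; it does not model the reduced emission efficiency of small grains that the text describes.

```python
# Wien displacement constant in micrometre kelvin.
WIEN_B_UM_K = 2897.771955


def wien_peak_um(temperature_k: float) -> float:
    """Return the peak wavelength (micrometres) of a blackbody spectrum."""
    return WIEN_B_UM_K / temperature_k


def temperature_for_peak_k(wavelength_um: float) -> float:
    """Return the blackbody temperature whose spectrum peaks at wavelength_um."""
    return WIEN_B_UM_K / wavelength_um


if __name__ == "__main__":
    print(f"peak at 80 K: {wien_peak_um(80.0):.1f} um")
    print(f"temperature peaking at 22 um: {temperature_for_peak_k(22.0):.0f} K")
```

A grain would need to be considerably warmer than ~80 K for its emission to peak in the WISE 22 µm band, which is why even a modest temperature rise in the smallest grains increases the measured flux.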

All of these considerations are suggestive of more complicated size 
distributions. A broken power law with temperature effects included
for the smallest particles would be a logical next step. However, its 
application is complicated by the fact that the position of the break and 
the change in the power-law index both depend sensitively on assump- 
tions about the radiating efficiency of the probably irregularly-shaped 
ring particles. Furthermore, collisions amongst ring particles and with 
interplanetary debris, sputtering by the solar wind, sand-blasting by
tiny interstellar dust grains, and other unmodelled loss mechanisms
are also likely to affect the ring’s current particle size distribution. 
Accordingly, it makes sense to convert the constraints of Fig. 4 into 
a form independent of any assumed size distribution: we find that 
soccer-ball-sized and larger rocks (2s > 20 cm) do not evolve signifi- 
cantly inward over the age of the Solar System and, accordingly, cannot 
account for more than ~10% of the observed ring flux.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS

In the course of its all-sky mid-infrared survey, the NASA Wide-field Infrared 
Survey Explorer (WISE) scanned over the position of Saturn 12 times between UT
2010-06-12.85 and UT 2010-06-13.98. Extended Data Fig. 1 shows a subset of the
WISE Band 4 (22 µm) ‘Level 1B’ individual frames that captured ring flux and did
not have significant artefacts from Saturn’s scattered light. Note that the wavelength
of Band 4 has been revised upward from 22.1 µm to 22.8 µm (ref. 15). WISE
scanned the sky systematically driven in ecliptic longitude by the steady precession 
of its orbit of about 1° per day, so from orbit to orbit the position of Saturn shifts 
several arcminutes across the 47-arcmin-wide frame. Saturn’s orbital motion of 1.6 
arcmin during the observing period is minor compared with this rate of spacecraft 
precession. Saturn itself spans only a couple of WISE pixels, so scattered light 
dominates the ‘image’ of Saturn on the scale of several arcminutes. In the upper 
two rows of Extended Data Fig. 1, the bright object embedded in the ring east (left) 
of Saturn is Phoebe. The ring flux is faint in the individual exposures (~3 DN per 
pixel at 160 Saturnian radii (Rs) compared with pixel-to-pixel RMS noise from the 
background of 10 DN; DN indicates Digital Number). The ring emission spans 
~50 pixels in vertical extent and is hundreds of pixels wide, so averaging over the 
spatial dimension as well as stacking individual frames substantially improves the
accuracy of the flux estimate. The pixel scale of the WISE Band 4 Level 1B images is 
5.5 arcsec per pixel. 

Atlas Images. WISE data products include an Image Atlas that optimally com- 
bines Level 1B individual frames into a final calibrated image registered to the 
celestial reference frame. Given the large spatial extent of the Phoebe ring (~300Rs
radius or ~40 arcmin), image smearing of the Phoebe ring in the Atlas Images due
to Saturn’s 1.6 arcmin motion relative to the stellar background is minor compared
with the extent of the ring flux and small compared with the bins used for fitting of the
ring flux in the main text. Two WISE All-Sky Atlas Image fields contain ring flux. 
In one (1800p030, Extended Data Fig. 2), Saturn lies off the edge of the image and 
the ring flux is largely uninfluenced by the proximity of Saturn. The adjacent Atlas 
Image (1784p030) contains Saturn itself, and the excessive flux from the planet 
caused significant corruption of the faint ring flux and underlying background 
flux, rendering this automatically generated Image Atlas frame largely useless for 
determining ring properties. 

Custom co-additions. Given the availability of the individual Level 1B exposures 
contributing to the atlas images, a selection of Level 1B exposures not significantly 
corrupted by the flux from Saturn itself were offset to compensate for Saturn’s 
motion on the sky and combined with a σ-rejected average to produce another 
representation of the WISE ring image (Extended Data Fig. 3). In this case, ring 
flux was reconstructed with fidelity both east and west of Saturn. 
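A σ-rejected average can be sketched as follows. This is a generic iterative sigma-clipping scheme written for illustration; the clipping threshold, the iteration limit and the function names are assumptions, and the frames are taken to be already shifted into Saturn’s frame of reference.

```python
import statistics

def sigma_rejected_mean(values, n_sigma=3.0, max_iter=5):
    """Mean of values after iteratively discarding outliers beyond n_sigma."""
    kept = list(values)
    for _ in range(max_iter):
        if len(kept) < 3:
            break
        centre = statistics.median(kept)
        spread = statistics.pstdev(kept)
        if spread == 0.0:
            break
        survivors = [v for v in kept if abs(v - centre) <= n_sigma * spread]
        if len(survivors) == len(kept):
            break
        kept = survivors
    return statistics.fmean(kept)

def stack_frames(frames, n_sigma=3.0):
    """Combine aligned frames (each a list of rows) pixel by pixel."""
    n_rows, n_cols = len(frames[0]), len(frames[0][0])
    return [[sigma_rejected_mean([f[r][c] for f in frames], n_sigma)
             for c in range(n_cols)] for r in range(n_rows)]
```

Rejecting outliers before averaging keeps transient contaminants, such as stars that drift through the field once the frames are aligned on Saturn, from biasing the faint ring signal.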

Ring flux estimation. Extended Data Fig. 3 shows the region of the ring, as well as 
background regions above and below the ring, isolated in strips 50 pixels in extent 
in the latitude direction (perpendicular to the ring). The flux in the ring region was 
estimated in a series of 8-pixel-wide bins in the longitude direction by calculating 
the median of the pixel values in each 8 × 50 pixel region while subtracting the 
average of the background medians from similar regions above and below the ring. 
Overall, three ring profiles resulted from extraction from the 1800p030 Atlas 
Images (ring east side) and the east and west sides of the full image in Extended 
Data Fig. 3. These profiles are largely independent, as the most significant noise 
source is the systematic level background offsets resulting from imperfect match- 
ing of the offset level in the individual Level 1B frames. These profiles show no 
evidence of east-west ring asymmetry. 
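The extraction described above can be sketched in a few lines. The function and argument names are hypothetical, and the three strips are assumed to be equally sized lists of rows (latitude) by columns (longitude); this is an illustration of the median-minus-background procedure, not the authors’ code.

```python
import statistics

def ring_profile(ring, bg_above, bg_below, bin_width=8):
    """Background-subtracted median flux in bins along the longitude axis.

    Each strip is a list of rows (latitude), all of equal length (longitude).
    """
    def bin_median(strip, start):
        return statistics.median(
            v for row in strip for v in row[start:start + bin_width])

    n_cols = len(ring[0])
    profile = []
    for start in range(0, n_cols - bin_width + 1, bin_width):
        # Average the medians of the two flanking background regions.
        background = 0.5 * (bin_median(bg_above, start) + bin_median(bg_below, start))
        profile.append(bin_median(ring, start) - background)
    return profile
```

Using the median in each region makes the estimate robust against point sources, and averaging the backgrounds above and below the ring removes a linear gradient in the latitude direction.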


Subtraction of scattered light from Saturn at small radii. The ring profile 
extractions outlined above suffer from contamination from excess scattered light 
from Saturn at small separations from the planet. Under the assumption that the 
scattered light is symmetric about the origin, additional profiles were constructed 
by rotating the image assembled from the individual Level 1B frames by 90° and 
subtracting it from itself (bottom half of Fig. 1). Doing so reveals structure (includ- 
ing a clear view of Iapetus) hidden previously by scattered light. The fidelity of this 
subtraction is only as good as the assumption that the scattered flux is symmetric, 
since the residual ring profile close to Saturn is a small fraction of the scattered flux 
being subtracted. Extended Data Fig. 4 shows the result of applying this technique 
to the bright point source Alpha Tau from the Band 4 WISE Image Atlas. The 
subtraction is free of significant artefacts that would bias ring flux extraction, and 
reveals subtle background matching offsets at the 0.2 DN level. Figure 2 contains 
an east-side and a west-side ring flux extraction from the rotated self-subtracted 
imagery that partially overlaps the extractions from the wider-field mosaics 
described above. 
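The rotate-and-subtract step can be illustrated with a toy image. The sketch below assumes a square image centred on the planet and scattered light that depends only on distance from the centre; the synthetic image and all names are illustrative assumptions.

```python
def rotate90(image):
    """Rotate a square image (list of rows) by 90 degrees about its centre."""
    n = len(image)
    return [[image[n - 1 - c][r] for c in range(n)] for r in range(n)]

def self_subtract(image):
    """Subtract the rotated image from the original, pixel by pixel."""
    rotated = rotate90(image)
    return [[a - b for a, b in zip(row, rot)] for row, rot in zip(image, rotated)]

def demo_image(n=9, ring_dn=3.0):
    """Radially symmetric 'scattered light' plus a horizontal 'ring' stripe."""
    mid = n // 2
    image = [[100.0 / (1 + (r - mid) ** 2 + (c - mid) ** 2) for c in range(n)]
             for r in range(n)]
    for c in range(n):
        image[mid][c] += ring_dn
    return image

residual = self_subtract(demo_image())
```

In the residual the symmetric component cancels and the horizontal stripe survives, accompanied by a negative copy of itself along the vertical axis. Any departure of the scattered light from fourfold symmetry would remain in the residual, which is why the fidelity of the method rests on the symmetry assumption.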

Non-uniform backgrounds. Structure in the background arising from zodiacal 
dust emission and galactic backgrounds could bias the ring flux measurements 
described above. Extended Data Figs 5-7 illustrate that the Band 4 background is 
smooth on the scales addressed here, and that the smooth gradient in background 
in the ecliptic latitude direction is small compared with the ring flux on the scales 
that impact the extracted flux. Given that there is a gradient in zodiacal emission 
increasing from north to south, it is possible that the 90° rotation will shift zodiacal 
background inappropriately from one ecliptic latitude to another. Fortunately, 
over the spatial scale of interest the zodiacal emission is nearly constant (becoming 
largely uniform in the lower portion of the image corresponding to high line 
number in Extended Data Fig. 6). 

Assumptions underlying the model fit. As the ring is optically thin, each point 
in the radial profile measured in Fig. 2 contains contributions from ring mater- 
ial orbiting at a range of distances from Saturn. We could, in principle, continue 
to process the data by mathematically removing successive outer layers of ring 
material to determine the material’s intrinsic radial distribution. This 
approach, however, has two serious disadvantages. First, errors in removing 
outer layers build up to strongly affect results for the inner layers, especially if 
the data are relatively noisy as in these WISE images. But even more impor- 
tantly, the orbits of particles in the Phoebe ring are strongly influenced by 
radiation pressure and hence expected to be highly elliptical. Thus the distri- 
bution of debris in the Phoebe ring deviates from the cylindrical symmetry 
required by standard data reduction techniques. Accordingly, we choose to 
proceed by building up entirely theoretical radial profiles of the Phoebe ring 
to compare directly to the data. 

Sample size. No statistical methods were used to predetermine sample size. 
Code availability. We have opted not to make available the numerical codes used 
to produce Figs 3 and 4, both because the codes were not designed to be easily 
portable and because we anticipate numerous significant upgrades in the next year. 

Extended Data Figure 1 | Ring flux in a subset of single (Level 1B) WISE 
Band 4 (22 μm) exposures. We organize these nine independent images so 
that each row contains three images centred at approximately the same 
ecliptic latitude (vertical direction). For each row, spacecraft orbital precession 
shifts Saturn in the ecliptic longitude direction so that the planet moves from 
left to right. The vertical ring extent is about 50 pixels and the horizontal extent 
exceeds 200 pixels, so even the single exposures can be binned to yield modest 
signal-to-noise ratio on the ring. 

Extended Data Figure 2 | A portion of the WISE Image Atlas frame 
containing ring emission. The image is rotated so that ecliptic north is up and 
east is to the left. In this Atlas Image, the bright emission from Saturn is 
outside the WISE field of view, but one of its diffraction spikes is visible at top 
right; processing artefacts to the bottom left and right should be ignored. The 
ring flux (horizontal stripe) is uncontaminated by Saturn’s scattered 
light, although the orbital motion of Saturn smears the embedded image of 
Phoebe into a bright oval. The faint point sources are distant stars and galaxies. 

Extended Data Figure 3 | A custom WISE Band 4 mosaic produced from 
selected Level 1B frames. Each selected Level 1B image was free from 
significant artefacts from Saturn’s scattered light. The frames have been shifted, 
offset to a common background level and stacked with trimmed average pixel 
filtering in Saturn’s frame of reference (so that Phoebe, the point source on 
the east/left side of the ring, appears unsmeared). Ecliptic north is up and east 
is to the left. Green lines highlight the regions used for flux extraction. 

©2015 Macmillan Publishers Limited. All rights reserved 


Extended Data Figure 4 | WISE Band 4 images of the star Alpha Tau. 
a, Direct image of the star. b, The same image rotated by 90° and subtracted 
from itself, mitigating scattered light. The large blob-shaped artefacts up, down, 
left and right from the central star are due to the reflection of starlight from the 
telescope’s internal structure. Significant artefacts from azimuthally 
asymmetric scattered flux near the star, by contrast, are not evident in the 
subtracted image b. Residuals after subtraction arise largely from frame offset 
mismatch and are typically of order 0.2 DN in the image in b (compared with 3 
DN for the ring flux at 160Rs). 

Extended Data Figure 5 | Examination of background structure. a, A WISE 
Band 3 (12 μm) optimally frame-matched mosaic oriented in ecliptic 
coordinates with north up and east to the left. b, Band 4 (22 μm) optimally 
frame-matched mosaic. The ring is evident in the middle of the Band 4 image, 
extending left and right horizontally from the central white over-exposed image 
of Saturn. The backgrounds are largely uniform, especially in Band 4, with 
the exception of a north-south gradient characteristic of zodiacal dust 
emission. Ring flux is not obviously evident in the 12 μm Band 3 exposure. 

Extended Data Figure 6 | Quantitative analysis of Band 4 background 
gradient in units of DN. This figure plots the average background level in DN 
(vertical axis) row-by-row (horizontal axis) in the Band 4 image shown in 
Extended Data Fig. 5. The analysis region slightly overlaps the ring flux, which 
appears as the small bump around line 3300 and establishes the ring plane. The 
sense of the rotation used in the 90° subtraction carries flux from line numbers 
3500-4000 into the ring midplane. Because the DN values are so similar, the 
bias introduced by the rotation is no greater than 0.1 DN while the inner 
ring flux is of order 6 DN. 

Extended Data Figure 7 | Colour composite of the Phoebe ring. Mosaics of 
images in WISE Bands 2, 3 and 4 (4.6, 12 and 22 μm) comprise the composite 
image in ecliptic coordinates. North is up and east is to the left. Scattered light 
from Saturn forms the bright white circle at the centre of the image and the ring 
is the faint horizontal bar that cuts across Saturn. 

Small-scale dynamo magnetism as the driver for 
heating the solar atmosphere 

Tahar Amari, Jean-Francois Luciani & Jean-Jacques Aly 

The long-standing problem of how the solar atmosphere is heated 
has been addressed by many theoretical studies, which have 
stressed the relevance of two specific mechanisms, involving magnetic 
reconnection and waves, as well as the necessity of treating the 
chromosphere and corona together. But a fully consistent model 
has not yet been constructed and debate continues, in particular 
about the possibility of coronal plasma being heated by energetic 
phenomena observed in the chromosphere. Here we report 
modelling of the heating of the quiet Sun, in which magnetic fields 
are generated by a subphotospheric fluid dynamo intrinsically 
connected to granulation. We find that the fields expand into the 
chromosphere, where plasma is heated at the rate required to 
match observations (4,500 watts per square metre) by small-scale 
eruptions that release magnetic energy and drive sonic motions. 
Some energetic eruptions can even reach heights of 10 million 
metres above the surface of the Sun, thereby affecting the very 
low corona. Extending the model by also taking into account the 
vertical weak network magnetic field allows for the existence of a 
mechanism able to heat the corona above, while leaving unchanged 
the physics of chromospheric eruptions. Such a mechanism rests 
on the eventual dissipation of Alfvén waves generated inside the 
chromosphere and that carry upwards the required energy flux of 
300 watts per square metre. The model shows a topologically complex 
magnetic field of 160 gauss on the Sun’s surface, agreeing with 
inferences obtained from spectropolarimetric observations, 
chromospheric features (contributing only weakly to the coronal 
heating) that can be identified with observed spicules and 
blinkers, and vortices that may be associated with 
observed solar tornadoes. 

Much work has been devoted to the magnetic properties of the Sun’s 
atmosphere, including the interpretation of the polarization observed 
in some spectral lines and advanced magnetohydrodynamic 
(MHD) simulations. Of particular interest is the conclusion, 
obtained through a plasma diagnostic technique based on the Hanle 
effect in atomic and molecular lines and supported by studies based on 
the Zeeman effect, that the quiet solar photosphere is teeming with 
a topologically complex, small-scale magnetic field which carries a 
substantial amount of magnetic energy density, more than sufficient 
to dominate the overall energy balance of the Sun’s atmosphere. As for 
the MHD studies, some of them describe atmospheric dynamical 
and heating effects resulting from the interaction of imposed 
photospheric motions with magnetic fields computed from longitudinal 
magnetograms of active regions. Other calculations try to include the 
upper part of the convection zone, but they generally find a dynamo 
that is too weak, perhaps due to the numerical difficulty of modelling 
a compressible dynamo in a domain open to an atmosphere. 

We use an MHD model designed to address the problem of the 
heating of the quiet-Sun atmosphere, assuming that the source of 
energy is the above-mentioned small-scale magnetic field. More 
precisely, we deal with the issue of identifying a mechanism for converting 
that source of energy into heat, but we do not consider here the 
thermodynamic and radiative response of the plasma. In our model, 
the magnetic field is assumed to be created by a subsurface small-scale 
dynamo operating in the upper 1.5 Mm of the convection zone. This 
thin layer, in which the plasma is taken to obey incompressible 
Boussinesq MHD equations, is coupled with an atmospheric region 
(described by a different set of MHD equations) that comprises a 
photosphere and a chromosphere of respective thickness 500 and 
1,500 km, as in the actual Sun, and a corona extending up to 
15 Mm. The coupling between both regions is performed through 
an interface at which the lower solution provides boundary conditions 
for the upper one. The atmosphere is taken to be initially at 
equilibrium (Extended Data Fig. 1) and its temperature is kept constant in 
time—a reasonable assumption given our specific aim. Technical 
details are given in the Methods section. 

We follow the evolution of the coupled system for a time long 
enough (140 min) to cover many cycles of life and death of the granu- 
lation cells (about 8 min for a cell) that develop in the lower layer and 
induce the dynamo process. When amplification saturates, the 
self-consistent magnetic field we obtain on the solar surface has a very 
significant horizontal component (Fig. 1a), in contrast with earlier 
current-free results. We obtain mean values of 160 and 28 G, 
respectively, for its strength and vertical component, which is in agreement 
with both observational values inferred from the polarization in some 
spectral lines and theoretical values found in the most recent 
compressible dynamo simulations (see Methods). The field appears to 
be organized at the granulation scale (that of fluid motions). But it also 
exhibits more persistent (with a lifetime of 30 min) mesoscale magnetic 
flux concentrations (Supplementary Video 1) that play an important 
role in the system evolution and appear to be associated with the bright 
points observed in Hα (ref. 24). We call them ‘mesospots’, by analogy 
with active region sunspots. A key quantity for our purpose is the 
time-averaged Poynting flux, whose divergence controls the transfer of 
energy from the magnetic field to the plasma. This flux, which can be 
used as a heating input in mean atmospheric models, is found to be 
consistent (Fig. 1b) with the flux required at the base of the 
chromosphere (4,500 W m⁻²) and at the transition region (300 W m⁻²). 
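For orientation, the role of the Poynting flux can be made explicit with the standard resistive-MHD relations, written here in SI units. These are textbook expressions under the usual neglect of the displacement current, not formulas quoted from the paper, and the symbols $\mathbf{S}$, $\mathbf{E}$, $\mathbf{j}$ and $\mu_0$ are introduced only for this illustration.

```latex
\mathbf{S} = \frac{1}{\mu_0}\,\mathbf{E}\times\mathbf{B}, \qquad
\mathbf{E} = -\mathbf{v}\times\mathbf{B} + \eta\,\mathbf{j}, \qquad
\mathbf{j} = \frac{1}{\mu_0}\,\nabla\times\mathbf{B},
\\[4pt]
\frac{\partial}{\partial t}\left(\frac{B^{2}}{2\mu_0}\right)
  + \nabla\cdot\mathbf{S} = -\,\mathbf{j}\cdot\mathbf{E}
  = -\,\eta\,j^{2} - \mathbf{v}\cdot(\mathbf{j}\times\mathbf{B}).
```

In a quasi-stationary state the time average of the first term vanishes, so a net inflow of Poynting flux into a layer is balanced by resistive dissipation and by the work done by the Lorentz force on the plasma.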

Besides these mean values, one needs to explain the observed magnetic 
complexity of the chromosphere and transition region. In 
our model this complexity is driven by the resistive emergence of 
structures from the subsurface dynamo. The magnetic field that 
emerges consists of interwoven flux tubes anchored to the 
photosphere, with many bifurcations (Fig. 1c) and with typical sizes 
increasing with height (Fig. 1d-g). Its topology is therefore complex, 
with the chromosphere being split into many magnetic cells delimited 
by singular surfaces, the so-called separatrices. The magnetic 
structures emerge with strong twist and shear, and later on suffer additional 
deformations driven by the surface velocity field. Intense electric 
currents flow across the magnetic lines and along them, with the 
parameter α (measuring the ratio of the parallel electric current to 
the magnetic field strength) being maximal in the chromosphere 
(Extended Data Fig. 2). In particular, as clearly seen in the lowest parts 
of Fig. 2a-c and Supplementary Videos 2 and 3, electric currents 
strongly concentrate near the separatrices, which appear as current 
sheets. Because of the presence of all these currents, the magnetic 
energy is far above the energy of the associated current-free magnetic 
configuration (see Fig. 1a and Extended Data Fig. 3), which has the 
smallest magnetic energy compatible with the flux distribution on the 
photospheric boundary. 

Figure 1 | Magnetic field and Poynting flux. a, Intensity of the total 
(black line), vertical component (blue) and current-free (green) magnetic 
field. b, Time averaged Poynting flux in the photosphere (red shading), 
chromosphere (orange) and corona (yellow) computed without (blue) and 
with (dashed blue) the presence of the large-scale vertical magnetic field 
(mimicking the supergranulation network). c, Selected field lines of the 
global magnetic configuration at time t = 52.17 min and characteristic 
features of its complex topology: a vertical twisted flux tube associated 
with a core of vortices/torsional motions (boxed area V), a twisted 
flux rope (TFR), and features that characterize the connection of the 


Chromospheric heating results in our model from the occurrence of 
many small eruptions triggered by reconnection. A part of the available 
free magnetic energy is converted by these events into kinetic energy of 
the plasma, which is viscously dissipated into heat (resistive dissipation 
adds only a weak contribution), and a quasi-stationary regime sets 
in where the local (time averaged) rate of magnetic energy dissipation 
is equal to that of energy injection from regions underneath. The 
characteristic timescale for the disruption of magnetic structures is the 
same as the one associated with their creation, namely the turbulent 
cell turnover time, of the order of 8 min. Eruptions are driven in 
particular by flux annihilation (see Supplementary Video 1), clearly 
acting on the 8 min timescale, which produces a rapid change in the 
magnetic topology and a decrease in the confinement of the tubes. As 
the plasma pressure is larger on average than the magnetic pressure in 
the chromosphere (while being smaller inside the tubes), magnetically 
driven fluid motions remain confined, and magnetic energy relaxation 
acts as a local source, as does the ‘mechanical flux’ used in standard 
atmospheric models. Moreover, the released energy should also be 
radiated quite locally owing to the weak thermal conductivity. 

[Figure 2, panel a label: log10(||j||²) at x = 6.0 Mm, time = 52.17 min] 

Figure 2 | Twisted flux rope eruption. a-c, Vertical cuts at 
constant x of the logarithm of the square of the modulus of the electric 
current density (expressed in non-dimensional code units). a, The TFR 
section (boxed) clearly shows that its eruption is associated with opening 
of the configuration, with current layers structuring the atmosphere at 
those heights and above (see Supplementary Video 3). 
d-f, Selected field lines showing the evolution of the TFR seen in Fig. 1c, 
starting from the same time (52.17 min). The background image 
represents the vertical component of the magnetic field at z = 1,500 km, 
which appears to be organized in stronger ‘spots’ and can reach locally 
30 G. d, e, The chromospheric trace of the TFR is associated with 
footpoints FP1 and FP2. Flux changes occur at its chromospheric ‘feet’ FP, 
as seen from d to e. They imply topological constraints leading to the 
eruption of the TFR shown at two particular times in boxes E1 and E2 
drawn in e and f, respectively. 

Our model also predicts the heating of a thin coronal layer located 
above the region where plasma and magnetic average pressures 
become comparable and the complexity of the magnetic field 
decreases. Coherent structures that are no longer confined generate 
emerging motions by expanding into the region above. In fact these 
structures are already present at 1.5 Mm above the surface, where they 
appear as coherent magnetic flux tubes with a width of about 1-2 Mm 
and a mean intensity of about 20 G (Fig. 1g). The physics is dominated 
by the relatively stable mesospots, at the periphery of which dynamical 
current sheets structuring the plasma are persistently created (as 
clearly seen in Fig. 2a-c and Supplementary Video 3). Eruptions in 
the neighbourhood of the mesospots are triggered by mechanisms (for 
example, flux cancellation) similar to those at work in large-scale 
eruptive events (see Fig. 2d-f), and they lead to plasma ejection at 
Alfvénic speed inside structures that can be associated with the ones 
generically called ‘blinkers’ by observers (see Methods). Backward 
motions occur through vortices centred on the mesospots that spiral 
downwards to the surface (areas V in Extended Data Figs 4 and 5), with 
associated horizontal and vertical velocities that can be as large as 
40 km s⁻¹ and are then consistent with those observed on the real 
Sun. The energy so released leads eventually to plasma heating up to 
a few Mm above the chromosphere in a way that must be strongly 
dependent on the details of the transport, radiation and dissipation 
mechanisms, unlike in the chromosphere. 

Figure 3 | Torsional motions and Poynting vertical flux. a, b, Horizontal 
slice of the horizontal component of the velocity field at two heights showing 
persistence of vortex/torsional motions with altitude when the vertical 
background field of 5 G is added. c, Horizontal slice at the same heights as in 
b of the vertical component of the velocity field associated with jet-like features 
forming in the regions free of vortices near the separatrices. d, z-component of 
the Poynting vector computed at the same heights as in b. There is clearly a 
correlation between the regions where it takes its largest values and the vortex/ 
torsional motions shown in a and b. A large amount of energy is transported 
towards the upper corona through Alfvén waves. The labelled boxes show 
characteristic magnetic features: twisted flux rope (TFR), cusped flux tube 
(CFT) with coronal streamer-like shape, and vertical twisted flux tube (VTFT) 
guiding Alfvén waves. 

We have thus demonstrated numerically that the heating of the 
chromosphere and of a thin coronal layer can be consistently explained 
by our surface dynamo driven model. But in our model the bulk of the 
corona cannot be heated by eruptions of chromospheric origin, which 
is in agreement with recently discussed ideas. To also provide an 
explanation for this heating, the model is now extended by introducing 
a new ingredient: the existence in the quiet-Sun atmosphere of magnetic 
loops reaching much higher altitudes than the surface dynamo 
generated field. These loops are created by a deeper dynamo and their 
photospheric footpoints are anchored in the supergranulation 
network. Here we take the existence of this additional field into account 
by superposing on our model a weak (5 G) magnetic field that is 
initially vertical and uniform. Owing to the chosen low value, neither 
the small-scale dynamo nor the chromospheric processes described 
above are significantly perturbed (see Methods). The magnetic 
structure resembles that of the well known mangrove ecosystem on Earth, 
with many ‘roots’ diving into the spaces between the granules, and ‘tree 
trunks’ going towards the corona (Extended Data Fig. 6). But a notable 
change appears in the corona. In the vertical magnetic flux tubes above 
each mesospot, Alfvén waves with relatively long lifetime (30-50 min) 
are generated and propagate upwards (see Fig. 3). They carry a positive 
Poynting flux (about 300 W m⁻² at 10 Mm) that is well correlated with 
the vortices (Fig. 3), and they eventually leave the computational box. 
This flux is larger than the one obtained without background field 
(Fig. 1b), which is a clear consequence of the inhibition of turbulent 
motion dissipation by the presence of coherent flux tubes. The waves 
are associated with perpendicular and parallel motions analogous to 
those recently observed. Perpendicular motions are inhomogeneous, 
which favours their dissipation through the creation of small length 
scales. Energy transported by the waves has to be eventually absorbed 
by the coronal plasma through some specific mechanism, which of 
course cannot be described by our MHD model. 

New interesting dynamic features appear, but they do not contribute 
to coronal heating except very near the bottom of the corona. We find 
in particular that short lived thin jets (with thickness 200 km and 
velocity up to 60 km s⁻¹; see Fig. 3) are launched in the current sheets 
surrounding the flux tubes associated with mesospots (see Extended 
Data Fig. 6 and Supplementary Videos 4 and 5 for our calculations, and 
Supplementary Video 6 for observations). They result from eruptions 
triggered between the tubes by mechanisms identical to those 
described above (see Methods). We conjecture that they are associated 
with type II spicules, a statement that gains support from the fact 
that the values of the total magnetic field found in our model around 
1.5 Mm (~30 G) are of the order of the averaged observed values, 
which can reach a maximum of 45 G (ref. 30). The global picture 
that emerges from our model, where we have localized Alfvén 
waves transported by magnetic structures reaching higher and higher 
altitudes and surrounded by jets, is fully consistent with recent 
observations. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

METHODS 


Model. Our approach to the global system composed of the upper part of the 
convection zone (CZ) and the solar atmosphere (SA) rests on the Resistive Layer 
Model (RLM). CZ and SA are represented by the two adjacent subregions $\Omega^{CZ}$ 
and $\Omega^{SA}$, respectively, and the magnetized plasma contained in each of them is 
described by a specific physical model that is detailed below. The two regions are 
coupled together through a third subregion overlapping them, $\Omega^{RBL}$, where RBL 
stands for Resistive Boundary Layer. We work here in the plane approximation 
(the curvature of the Sun’s surface is neglected), with $\Omega^{CZ}$ and $\Omega^{SA}$ being taken to 
be parallelepiped boxes in contact along the plane {z = 0} (using Cartesian 
coordinates (x, y, z)). 

It should be noted that similar calculations could also be done, in principle, by 
replacing either of the two plasma models used here by an alternative (for example, 
we could describe CZ by a compressible model and SA by a model such as 
Bifrost, MURaM, CO5BOLD, PENCIL or Stagger). 
Upper convection zone model. A subsurface solar dynamo in a closed box has 
been demonstrated to occur in the Boussinesq approximation and more recently 
confirmed also to occur in compressible MHD using a different numerical 
method as well as different boundary conditions. For simplicity we decided to use 
the more robust magnetized Boussinesq model for describing the physics in the 
upper part $\Omega^{CZ}$ of the CZ. The incompressible MHD equations are those 
introduced in a previous paper. They are as follows: 


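$$\partial_t \mathbf{v} + \nabla\cdot(\mathbf{v}\otimes\mathbf{v}) = -\nabla\cdot\left[\left(p + \frac{B^2}{2}\right)\mathbf{I} - \mathbf{B}\otimes\mathbf{B}\right] + \nabla\cdot(\sigma\nabla\mathbf{v}) + \sigma R_a\,\theta\,\hat{\mathbf{z}} \qquad (1)$$

$$\partial_t \mathbf{B} = \nabla\times(\mathbf{v}\times\mathbf{B}) - \nabla\times\left[\left(\frac{\sigma}{\sigma_m}\right)\nabla\times\mathbf{B}\right] \qquad (2)$$

$$\partial_t \theta + \nabla\cdot(\theta\,\mathbf{v}) = \mathbf{v}\cdot\hat{\mathbf{z}} + \Delta\theta \qquad (3)$$

$$\nabla\cdot\mathbf{B} = 0, \qquad \nabla\cdot\mathbf{v} = 0 \qquad (4)$$

with equation (1) implying the equation and boundary condition for p: 

$$\Delta p = -\nabla\cdot\left[\nabla\cdot(\mathbf{v}\otimes\mathbf{v})\right] - \nabla\cdot\left[\nabla\cdot\left(\frac{B^2}{2}\mathbf{I} - \mathbf{B}\otimes\mathbf{B}\right)\right] + \sigma R_a\,\partial_z\theta \qquad (5)$$

$$\partial_n p = -\sigma\,\mathbf{n}\cdot\left[\nabla\times(\nabla\times\mathbf{v})\right] - \mathbf{n}\cdot\left[\nabla\cdot\left(\frac{B^2}{2}\mathbf{I} - \mathbf{B}\otimes\mathbf{B}\right)\right] + \sigma R_a\,\theta\,\mathbf{n}\cdot\hat{\mathbf{z}} \quad \text{on } \partial\Omega^{CZ} \qquad (6)$$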


These equations are written with appropriate non-dimensionalization and they 
are set in the numerical box $\Omega^{CZ}$ that extends from {z = −1} to {z = 0} and has 
horizontal dimensions 10 × 10. The notations are as follows: B is the magnetic 
field, v the plasma velocity, p the plasma pressure, $\theta$ the temperature deviation with 
respect to the linear convective profile $T(z) = T_0 + z\Delta T$ (where $T_0 = T(z = -1)$ 
and $\Delta T = T_0 - T(z = 0)$), $\sigma$ the Prandtl number, $\sigma_m$ the magnetic Prandtl number, 
and $R_a$ the Rayleigh number. 

We use third order Adams-Bashforth time integration that is implicit for the 
dissipation terms present in the momentum (1), magnetic (2), and temperature (3) 
equations. Equations (5) and (6) for the pressure ensure incompressibility of the 
flow, and they are solved at each time step by an efficient incomplete LU 
factorization by level preconditioner. The convective terms are treated by a third order 
non-oscillatory upwind scheme. The numerical resolution is 287 × 287 × 25 and the 
spatial discretization uses a staggered mesh in which v and B are evaluated on the 
faces of the cubic cells, while p and $\theta$ are evaluated at the cell centres. 
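As a reminder of what a third order Adams-Bashforth step looks like, the following is a minimal sketch for a scalar ordinary differential equation. It is fully explicit and uses a Runge-Kutta start-up, both of which are simplifying assumptions: the implicit treatment of the dissipation terms described above is not reproduced, and the function name is hypothetical.

```python
import math

def ab3_integrate(f, y0, t0, t1, n_steps):
    """Integrate dy/dt = f(t, y) from t0 to t1 with explicit AB3."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    history = []  # derivative values at previous steps, newest last
    for _ in range(n_steps):
        history.append(f(t, y))
        if len(history) < 3:
            # Start-up: classical RK4 until three derivative values exist.
            k1 = history[-1]
            k2 = f(t + h / 2, y + h / 2 * k1)
            k3 = f(t + h / 2, y + h / 2 * k2)
            k4 = f(t + h, y + h * k3)
            y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        else:
            # AB3: y_{n+1} = y_n + h (23 f_n - 16 f_{n-1} + 5 f_{n-2}) / 12
            f_n, f_n1, f_n2 = history[-1], history[-2], history[-3]
            y += h * (23 * f_n - 16 * f_n1 + 5 * f_n2) / 12
            history.pop(0)
        t += h
    return y
```

For example, `ab3_integrate(lambda t, y: -y, 1.0, 0.0, 1.0, 200)` approximates exp(−1). A multistep scheme of this kind needs only one new evaluation of the right-hand side per step, which is attractive when that evaluation involves a full three-dimensional MHD operator.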

For the boundary conditions, which are taken to be periodic in the two hori- 
zontal directions, we impose a horizontal seed magnetic field B, = 1 at the 
boundary {z = −1}, we set $v_z(z = -1) = v_z(z = 0) = 0$ and we require $v_{x,y}$ to 
have vanishing vertical derivatives at {z = −1, 0}. Note that some authors impose 
a vanishing parallel component of the magnetic field on the upper part of the 
boundary instead of the condition $v_z(z = 0) = 0$. This last is preferred here as it 
appears to lead to results in better agreement with the observations. 

Initially, we choose the magnetic field to be horizontal, with a strength decreas- 
ing linearly from B, at z = —1 to zero at z = 0, the velocity to vanish, and give to 0 a 
small fluctuating value in order to trigger the convective instability. We use the 
same regime parameters as previously, with $\sigma = 1$, $R_a$ = 500,000 and $\sigma_m = 5$. As 
discussed below, these values enable us to reach a simple regime of low diffusivity 
where the dynamo is saturated. 

Setting of the physical values in the Boussinesq model. This is a key step for 
obtaining quantitative results. The Boussinesq model is expressed in 
dimensionless units. As the surface density, the surface temperature, and the gravity are 
known, we need to fix two physical units (length and magnetic intensity). We 
found that this can be done accurately by matching the granulation cell size with 
typical data, and the r.m.s. horizontal surface velocity with that furnished by three 
state-of-the-art models. With a choice of parameters previously used, this 
gives a length unit of 1.5 Mm (therefore the box $\Omega^{CZ}$ has physical dimensions 
15 × 15 × 1.5 in Mm) and a velocity unit of 30 m s⁻¹, which sets the magnetic 
field intensity unit to 5.6 G—then the field imposed at the lower boundary has a 
strength B, = 5.6 G. There is no need for any additional arbitrary input. When the 
dynamo saturates, a quasi-stationary magnetic field is reached. 
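The conversion from code units to physical units can be checked with a short calculation. The sketch assumes that the magnetic unit is the field whose Alfvén speed equals the velocity unit, B = v (μ0 ρ)^(1/2), and it uses an illustrative surface density of 2.8 × 10⁻⁴ kg m⁻³; both the normalization and the density value are assumptions made here, not statements taken from the text.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability, T m / A

def magnetic_unit_gauss(velocity_unit_m_s, density_kg_m3):
    """Field strength whose Alfven speed equals the velocity unit."""
    tesla = velocity_unit_m_s * math.sqrt(MU_0 * density_kg_m3)
    return tesla * 1.0e4  # 1 T = 10^4 G

LENGTH_UNIT_MM = 1.5       # length unit in Mm, from the text
VELOCITY_UNIT_M_S = 30.0   # velocity unit, from the text
SURFACE_DENSITY = 2.8e-4   # kg m^-3, ASSUMED illustrative photospheric value

# Convection-zone box: 10 x 10 x 1 in code units.
box_code_units = (10.0, 10.0, 1.0)
box_mm = tuple(LENGTH_UNIT_MM * side for side in box_code_units)

b_unit = magnetic_unit_gauss(VELOCITY_UNIT_M_S, SURFACE_DENSITY)
```

With these inputs the box measures 15 × 15 × 1.5 Mm and the magnetic unit comes out at a few gauss, of the same order as the 5.6 G quoted above; the exact value depends on the adopted surface density.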

Atmosphere model. To describe the evolution of the plasma in the SA domain
Ω^SA, we use our code METEOSOL, which has been exploited extensively for
computing coronal magnetic configurations, in particular in the context of
large-scale eruptive phenomena such as coronal mass ejections. This code solves
the resistive compressible MHD equations, written in non-dimensional form,


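$$\rho\,\partial_t \mathbf{v} = -\rho(\mathbf{v}\cdot\nabla)\mathbf{v} + (\nabla\times\mathbf{B})\times\mathbf{B} - \nabla p + \nabla\cdot(\nu\rho\nabla\mathbf{v}) + \rho\mathbf{g} \tag{7}$$

$$\partial_t \mathbf{B} = \nabla\times(\mathbf{v}\times\mathbf{B}) - \nabla\times(\eta\nabla\times\mathbf{B}) \tag{8}$$

$$\partial_t \rho + \nabla\cdot(\rho\mathbf{v}) = 0 \tag{9}$$

$$\partial_t p + \nabla\cdot(p\mathbf{v}) = -(\Gamma - 1)\,p\,\nabla\cdot\mathbf{v} + H \tag{10}$$

$$p = \rho T \tag{11}$$

$$\nabla\cdot\mathbf{B} = 0 \tag{12}$$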


Here ρ denotes the mass density, T the temperature, ν the kinematic viscosity, η the
resistivity, g the gravitational field, and H the heating/cooling term. In the
computations reported in this Letter, however, we introduce a major change with
respect to the initial version of METEOSOL, which was developed for the
numerical simulation of low-beta plasmas. We discard the energy equation (10), and
close the system by requiring the temperature T in equation (11) to keep a
time-independent profile T_0(z) (Extended Data Fig. 1). This may appear to be an
oversimplification if one compares with more advanced models, but those too need to
make some restrictive assumptions.

The MHD equations are set in the domain Ω^SA that has the same horizontal size
as Ω^CZ and extends vertically from {z = 0} to {z = 10}, thus representing the part
of the corona below 15 Mm. The boundary conditions are periodic in the two
horizontal directions and absorbing in the vertical one. The boundary conditions
at {z = 0} are specified below in the subsection ‘Optimizing the RLM parameters’.
The code uses a staggered Cartesian non-uniform mesh. The spatial discretization
of the operators is defined in such a way that magnetic helicity and topology are
well conserved in the weakly resistive limit and the constraint ∇·B = 0 is satisfied
to round-off errors. This latter property is crucial for following an evolution in
which the topology evolves rapidly as a consequence of flux changes, such as those
arising at the photospheric level in the quiet Sun.

We use the same kind of velocity limiter as previously. This results in the
saturation of large velocities of the order of 100 km s⁻¹. In phenomena such as jets,
the velocities may therefore be underestimated by our computations. The resistive
equation is solved implicitly. Convective terms are solved using a high-order
upwind scheme.

In the present study, the model is initialized with a density profile ρ_0(z) that is
a solution of the hydrostatic equation. This profile differs slightly from VAL data. The
temperature (density) is slightly smaller (larger), and the temperature step near 
the transition region is smoothed, as in an oscillatory averaged transition region. 
This leads to a definition of the various layers: photosphere (width 500 km), 
chromosphere (width 1,500 km), transition region (at 2,000 km above the surface), 
and corona (Extended Data Fig. 1). It is worth noting that the temperature profile 
could have been extended above 1 MK. It is saturated here only to simplify the 
numerical computations. 
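
As an illustration of how a hydrostatic density profile follows from a prescribed temperature profile, the sketch below integrates dp/dz = −ρg with the closure p = ρT_0(z), which gives ρ(z) = ρ(0) T_0(0)/T_0(z) exp(−∫ g/T_0 dz′). The tanh-smoothed temperature step, its parameters, the value of g and the base density are all hypothetical choices in dimensionless units; they are not the profile of Extended Data Fig. 1.

```python
import math


def t0(z, t_low=1.0, t_high=100.0, z_tr=1.33, width=0.1):
    """Hypothetical smoothed temperature step (assumed shape and values)."""
    return t_low + 0.5 * (t_high - t_low) * (1.0 + math.tanh((z - z_tr) / width))


def hydrostatic_density(z_grid, g=1.0, rho_base=1.0):
    """Density solving d(rho*T0)/dz = -rho*g on z_grid (trapezoidal quadrature)."""
    rho = [rho_base]
    integral = 0.0  # running value of the integral of g / T0
    for z_prev, z_next in zip(z_grid[:-1], z_grid[1:]):
        integral += 0.5 * (z_next - z_prev) * (g / t0(z_prev) + g / t0(z_next))
        rho.append(rho_base * t0(z_grid[0]) / t0(z_next) * math.exp(-integral))
    return rho


n = 4001
z = [10.0 * i / (n - 1) for i in range(n)]     # heights from z = 0 to z = 10
rho = hydrostatic_density(z)
p = [r * t0(zi) for r, zi in zip(rho, z)]      # pressure from p = rho * T0

print(f"density contrast between z = 0 and z = 10: {rho[0] / rho[-1]:.3e}")
```

The density drops sharply where the temperature rises, because the pressure p = ρT_0 must stay continuous and decrease smoothly with height.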

Finally, we take ν = 0.1 for the viscosity and η = 0.1 for the resistivity (in our
units). The numerical resolution is 287 × 287 × 197.

Coupling the models of the Sun’s interior and exterior. The coupling between
the domains Ω^CZ and Ω^SA is done by introducing at their interface a layer
with small thickness δz and enhanced resistivity. This is the key idea at the basis
of the RLM, which was introduced to address the following issue: how to close a
subphotospheric MHD model at the top of the convection zone domain Ω^CZ in
order to naturally allow the transfer of magnetic energy and helicity into the SA
through a non-current-free magnetic field. As shown in our previous studies,
the key quantity controlling the transfer of magnetic energy and helicity
between two regions is the parallel component of the electric field. In the
RLM, this quantity is continuous during the crossing of the layer. Its expression,
however, changes from an ideal (or weakly resistive) one near the top of the CZ
to a very resistive one inside the layer, and again to an ideal (or weakly resistive)
one at the basis of the SA, where it acts as the driver of an evolution in which
non-current-free magnetic fields are naturally produced. This model has been
shown to lead to resistive emergence of large-scale twisted flux ropes producing
coronal mass ejections.

Optimizing the RLM parameters. The thickness of the resistive layer needs to be
comparable to that of the return layer, Δ, where the convection flow changes its
direction from purely vertical to horizontal (the velocity has to match the
condition v_z = 0 at z = 0). Δ is much smaller than the length scale L_C of a
convection cell. We first choose the layer thickness δz ≈ Δ. The other important
parameter is the resistivity. It is required to exhibit an enhancement inside the
boundary layer and to match the resistivity η_CZ of Ω^CZ and η_SA of Ω^SA on the
lower and upper boundaries of the layer, respectively. We chose a resistivity profile
of the Gaussian form η(x,y,z) = η_0 exp[−(z − z_r)²/a_r²] + η_CZ 1_CZ + η_SA 1_SA,
where we have introduced the characteristic functions for Ω^CZ (1_CZ) and Ω^SA
(1_SA) (the characteristic function 1_D of a domain D takes the value 1 inside D
and 0 elsewhere).

Selection of the parameters is guided by the following general considerations.
We note that choosing η_0 too large would result in killing the magnetic field
emerging into Ω^SA, while imposing too small a value would prevent field transfer
through the boundary {z = 0}. This determines a range of possible values for η_0
that we further restrict by requiring that a magnetic structure convected in Ω^CZ
(on timescale t_C) should not diffuse (on timescale t_D) before being transferred
upwards. This demands t_D > t_C and constrains η_0 to lie near its smallest allowed
value. Eventually, we end up with the following choices: η_0 = 1, η_CZ = 0.1,
η_SA = 0.1, z_r = 0, and a_r = 0.1. We have checked that going from these values
for the resistivity to smaller ones results in no magnetic field being transferred into
Ω^SA. This limiting case corresponds exactly to the situation that is considered in
various dynamo studies in a closed box.
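
A minimal sketch of such a resistivity profile is given below, using the parameter values quoted above. The exact normalization of the Gaussian exponent, the names `z_r` and `a_r` for the centre and width of the layer, and the function name itself are assumptions made for illustration.

```python
import math


def eta_profile(z, eta0=1.0, eta_cz=0.1, eta_sa=0.1, z_r=0.0, a_r=0.1):
    """Resistivity at height z: Gaussian enhancement plus a background value.

    The background is eta_cz in the convection zone (z < 0) and eta_sa in the
    atmosphere (z >= 0), mimicking the two characteristic functions.
    The form exp(-((z - z_r) / a_r)**2) is an assumed normalization.
    """
    gaussian = eta0 * math.exp(-(((z - z_r) / a_r) ** 2))
    background = eta_cz if z < 0.0 else eta_sa
    return gaussian + background


for height in (-1.0, -0.2, -0.1, 0.0, 0.1, 0.2, 5.0):
    print(f"z = {height:5.2f}  eta = {eta_profile(height):.4f}")
```

Away from the interface the profile falls back to the background value of 0.1 on both sides, so the enhancement is confined to a layer a few widths `a_r` thick around z = 0.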

The boundary conditions at the interface follow. The boundary conditions for
Ω^CZ are those defined above. For the MHD model in Ω^SA, we obtain on {z = 0}: the
horizontal component of the electric field, E_{x,y}(z = 0⁺) = E_{x,y}(z = 0⁻), the
tangential component of the velocity field, v_{x,y}(z = 0⁺) = v_{x,y}(z = 0⁻), and
the mass density, ρ(z = 0⁺) = ρ(z = 0⁻). The magnetic field evolves consistently by
the induction equation and does not need to be prescribed.

Global evolution of the system in Ω and initialization of the simulation.
For each time step Δt, the RLM evolution of the system in the global domain
Ω = Ω^CZ ∪ Ω^SA is performed in three substeps: (i) an ideal MHD substep in Ω^CZ,
(ii) an ideal MHD substep in Ω^SA, and (iii) a purely resistive MHD substep in the
whole Ω where we switch on the η-profile defined above.
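
The three substeps can be sketched on a one-dimensional toy problem. In the sketch below a horizontal field B(z) is advanced by two placeholder ideal substeps (identity maps by default, standing in for the ideal MHD solvers in each domain) followed by an implicit, backward-Euler resistive substep for ∂_t B = ∂_z(η ∂_z B). The 1D reduction, the zero-flux end conditions, the grid, the time step and all function names are assumptions made for illustration; this is not the METEOSOL scheme.

```python
import math


def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal linear system with the Thomas algorithm."""
    n = len(rhs)
    c_prime, d_prime = [0.0] * n, [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        pivot = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / pivot
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / pivot
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def resistive_substep(b, eta_face, dt, dz):
    """Backward-Euler step of dB/dt = d/dz(eta dB/dz) with zero flux at both ends."""
    n = len(b)
    r = dt / dz**2
    lower, diag, upper = [0.0] * n, [0.0] * n, [0.0] * n
    for i in range(n):
        e_lo = eta_face[i] if i > 0 else 0.0          # face below cell i
        e_hi = eta_face[i + 1] if i < n - 1 else 0.0  # face above cell i
        lower[i], upper[i] = -r * e_lo, -r * e_hi
        diag[i] = 1.0 + r * (e_lo + e_hi)
    return thomas_solve(lower, diag, upper, b)


def rlm_step(b, z, eta_face, dt, dz, ideal_cz=None, ideal_sa=None):
    """One split step: (i) ideal in z < 0, (ii) ideal in z >= 0, (iii) resistive everywhere."""
    identity = lambda values: list(values)  # placeholder for an ideal MHD solver
    ideal_cz = ideal_cz or identity
    ideal_sa = ideal_sa or identity
    below = ideal_cz([bi for bi, zi in zip(b, z) if zi < 0.0])
    above = ideal_sa([bi for bi, zi in zip(b, z) if zi >= 0.0])
    return resistive_substep(below + above, eta_face, dt, dz)


n_cells, dz, dt = 220, 0.05, 0.01
z = [-1.0 + (i + 0.5) * dz for i in range(n_cells)]             # cell centres
faces = [-1.0 + i * dz for i in range(n_cells + 1)]             # cell faces
eta_face = [0.1 + math.exp(-((zf / 0.1) ** 2)) for zf in faces]  # enhanced near z = 0
b0 = [max(0.0, -zi) for zi in z]   # field decreasing linearly to zero at z = 0
b1 = rlm_step(b0, z, eta_face, dt, dz)
```

Because the resistive substep is written in flux form, the total horizontal flux is unchanged by the step, while the enhanced resistivity near z = 0 lets part of the field leak from the lower domain into the upper one.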

We first run the simulation until a nonlinearly saturated regime has been
reached by the dynamo. When this happens, we replace the field which has been
obtained in the atmosphere by the unique current-free magnetic field that has the
same vertical component on {z = 0}, with the plasma being taken to be in
hydrostatic equilibrium. Actually, this particular choice could be replaced by any
other convenient one, as the system rapidly loses the memory of that condition. After
about 10 min, the hydrostatic profile settles down to a quasi-stationary state, with a
new average pressure equilibrium and a magnetic field that is no longer
current-free, in contrast to the field used in simple models. Our choice for initializing
the simulation therefore allows us to demonstrate the strong difference between the
magnetic configuration in our model (Fig. 1c) and the associated current-free 
magnetic configuration (see Extended Data Fig. 3, where field lines are launched 
from the same footpoints as in Fig. 1c). In particular one can observe in the 
actual field sheared and twisted structures (carrying electric currents and magnetic 
helicity, as well as free magnetic energy) that are absent in the current-free 
configuration. 
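
To make the notion of a current-free field with a prescribed vertical component on {z = 0} concrete, the sketch below treats a single horizontal Fourier mode in two dimensions: for B_z(x, 0) = B_0 cos(kx), the field that decays with height and carries no current is B_x = B_0 sin(kx) e^(−kz), B_z = B_0 cos(kx) e^(−kz). The 2D reduction, the single mode, the decay condition at large z and the function names are assumptions made for illustration; a periodic boundary distribution would be treated as a superposition of such modes.

```python
import math


def potential_mode(x, z, b0=1.0, k=2.0 * math.pi / 10.0):
    """Current-free field (B_x, B_z) of one Fourier mode, decaying with height."""
    decay = math.exp(-k * z)
    return b0 * math.sin(k * x) * decay, b0 * math.cos(k * x) * decay


def divergence_and_current(x, z, h=1.0e-4):
    """Central-difference estimates of div B and of the current j_y = dBx/dz - dBz/dx."""
    bx_xp, bz_xp = potential_mode(x + h, z)
    bx_xm, bz_xm = potential_mode(x - h, z)
    bx_zp, bz_zp = potential_mode(x, z + h)
    bx_zm, bz_zm = potential_mode(x, z - h)
    div = (bx_xp - bx_xm) / (2.0 * h) + (bz_zp - bz_zm) / (2.0 * h)
    j_y = (bx_zp - bx_zm) / (2.0 * h) - (bz_xp - bz_xm) / (2.0 * h)
    return div, j_y


for point in ((1.0, 0.5), (3.7, 2.0), (8.2, 6.0)):
    div, j_y = divergence_and_current(*point)
    print(f"x, z = {point}: div B = {div:.2e}, j_y = {j_y:.2e}")
```

Both the divergence and the current vanish to within the finite-difference error, so such a field has no shear or twist, which is the contrast drawn above with the field produced by the model.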

Properties of the dynamo and of the surface magnetic field. In the nonlinearly
saturated dynamo regime, the average magnetic energy is about 20% of the average
kinetic energy, so equipartition does not hold globally. Near the surface,
however, it holds locally inside the intergranular vortices, where the magnetic and
kinetic energy densities are found to be similar, up to fluctuations. These
properties are in agreement with those inferred from observations of the scattering
polarization in atomic and molecular lines.

At the surface, the mean magnetic field is found to have a vertical component B_z
of intensity 28 G and a total strength of 160 G, in agreement with observed
values. This field appears to be structured at the granular scale and also at a
mesoscale, with concentrations that have typical values of 500-1,000 G and undergo
rapid flux coalescence and cancellation (see Supplementary Video 1). Note that the
existence of the mesoscale structuration is an observed property of the surface 
magnetic field that has been well established by statistical studies and has a direct 
impact on the eruptive coronal structures. 

Two points are worth noting here. (i) More realistic compressible simulations
also show the local equipartition of the energy and the mesoscale organization
mentioned above, and find similar values (for example, 25 G and 30 G)
for the vertical component of the magnetic field at the surface. This gives us
confidence in the adequacy of our MHD Boussinesq model for describing the
local surface dynamo, for which it provides the correct values of the lengths,
velocities, vertical fields and observed spatial structures. Of course, this model
would be quite questionable if it were applied in the deeper interior of the
convection zone. (ii) When building up a dynamo model, a delicate issue is the choice
of appropriate boundary conditions. This choice, which is never devoid of some
arbitrariness, has in general a strong effect on the results. For instance, the vertical
component B_z of the surface field computed in a compressible model is found to
change from a ‘lower limit’ of 30 G to an ‘upper limit’ of 84 G when closed boundary
conditions are replaced by open (for instance, reflecting) conditions on the lower 
part of the boundary. Moreover, a larger value is obtained in the former case for the 
ratio of the tangential component of the field to its vertical component. In our 
incompressible calculations, we use boundary conditions which are close to the 
closed ones and we are at the lower limit: the behaviour of the field in the surface 
layer is not influenced by the processes occurring in the deeper layers of the 
convection zone, which control the solar cycle. Our dynamo thus stays independ- 
ent of that cycle, in agreement with the observations recalled at the beginning of 
the main text that suggest that the small-scale surface fields of the quiet Sun are not 
modulated by the 11-year activity cycle. In the opposite upper limit, on the con- 
trary, the deeper convection zone would get saturated by fields in equipartition 
filling it by reflection, and would exert a strong influence on the surface field. 

Eruption of twisted flux ropes and coronal velocities. During the evolution in
the low corona, coherent magnetic flux elements can build up (see the vertical
component of the magnetic field shown in Fig. 1d-g) and become the root of
twisted flux ropes (TFRs) that are structurally quite similar to the large-scale ones
extending in the corona and anchored in the sunspots of active regions. An example of
such a TFR is clearly seen inside the box ‘TFR’ of Fig. 1c and its magnification is
shown in Fig. 2d. These TFRs evolve in response to magnetic flux changes at their
‘chromospheric feet’, which leads eventually to their eruptions at upper altitude, as
in some models of coronal mass ejections. For the TFR of Fig. 2d, for instance,
one can see in Fig. 2e that its ‘upper’ (right) feet FP2 have been subjected to a strong
topological change. They have moved from a ‘blue’ polarity to another one that has
approached the flux rope, and this results in the eruption seen in E1 and E2 of
Fig. 2e, f.

The velocity field. The overall velocity pattern shows striking features that
reproduce recent observations: at the end of an eruption, a large volume of plasma
shoots upwards (E2 in Fig. 2a-c). These eruptions occur ubiquitously with
different strengths, and reach the base of the corona. The matter then falls downwards,
with the return flow being organized in vortices concentrated in the quiet flux 
tubes that are always present and reach the low photosphere (see V and CL in 
Fig. 1, and Extended Data Figs 4, 5). These vortices correspond to motions spir- 
alling nearly down to the surface (see CL in Extended Data Fig. 4c). We interpret 
them as the photospheric counterparts to the vortices present in the convection 
zone. These transport cold matter into the intergranular network, and are the 
locations of the strong magnetic field amplification at the source of the quiet- 
Sun magnetic flux tubes. Then both structures actually behave as parts of a unique 
vortex, artificially cut at the surface by our choice of boundary conditions. In 
Supplementary Video 2, the horizontal velocity pattern and the current intensity 
are represented in a horizontal cut at an altitude of 2 Mm. One can see that the 
vortices are separated by current sheets and disappear when an eruption of their 
guiding magnetic flux ropes occurs. They move afterwards on the plane with a 
long lifetime (tens of minutes) that is certainly related to the recycling time at 
mesoscale. 

Magnetic configuration in the vicinity of supergranulation. To take into
account the large-scale magnetic field associated with supergranulation, we next
add a background vertical field B_bg = B_0 ẑ (B_0 = 5 G) occupying the whole domain
Ω (then the total field stays continuous). In the CZ domain, B_0 is close to the value
of the seed field previously used* and its pressure is only ~10~* of the density of 
kinetic energy of the plasma. Then the added field is too weak to have an effect on 
the dynamo and to change the average value of the normal component of the 
surface magnetic field. Of course this result is independent of the particular con- 
dition imposed on the lower boundary. In the atmospheric domain and up to 
chromospheric heights, the resulting magnetic configuration stays about the same 
as the one previously computed, which completely dominates the background 
field. Therefore it exhibits the same multipolar character and topological com- 
plexity, in agreement with the observations of the supergranulation mixed polarity 
regions, and its evolution is still driven by photospheric magnetic cancellation/ 
resistive emergence. Higher up, however, the added background field progres- 
sively starts to organize vertically most of the coronal magnetic configuration. 
This is due to the fact that the field passes through a maximum value of 30 G 
around z = 1,500 km and then decreases. Several features characterize the new 
magnetic topology (Extended Data Fig. 6). (i) Large-scale vertical twisted flux 
tubes (VTFT) are anchored on the surface magnetic concentrations. They have 
their ‘chromospheric feet’ located above the mesospots, inside the bright points
observed in Hα, and extend up to the top of the computational box. (ii) TFRs are
still present. They form in the region between the vertical tubes (Fig. 3a), to which 
they often connect in a continuous way. They reach lower heights than in the case 
where B_0 = 0. They eventually merge with the background field in a way
reminiscent of coronal helmet streamers above lower closed structures. This is clearly
visible at some times (for example, t = 72.42 min) in Supplementary Video 5, 
whose contents have to be compared with those of Supplementary Video 6 
obtained by processing observational data furnished by IRIS. (iii) ‘Open’ cusped 
flux tube structures (CFT) develop outside twisting motions. (iv) Finally, magnetic 
arcades (MA) are also present, with some of them located under CFT and
exhibiting a multipolar structure.

Alfvén waves and jets in the vicinity of supergranulation. Not surprisingly,
most TFRs and MAs still undergo eruptions. Owing to the connection of these
structures to the nearby vertical tubes, however, the eruptions that are triggered
differ from those occurring in the absence of the background field by two basic
effects. First, the perturbations resulting from the reconnection events happening
near the top of the chromosphere generate Alfvén waves that can propagate along
the vertical tubes and leave the box. It is worth emphasizing that these waves,
whose progressive dissipation should heat the corona, are produced well above the
Sun’s surface in our model. This makes a strong difference from most wave coronal
heating models in which the waves originate from below that surface and have first
to cross the photosphere and the chromosphere without too much attenuation.
Second, thin jets are launched in the sheets forming the interface between TFR and
vertical tubes.

The physics of the eruptions is illustrated in Supplementary Video 4, which 
shows the evolution of a vertical cut of the z-component of the velocity field. One 
may observe three dominant features organized around the mesospots and the 
waves that are generated above them in the VTFT and reach the top of the box
(bulk of the corona): (1) the eruptive regions near the transition region launched 
between VTFT, (2) the jets in the sheets at the boundary with the VTFT, and (3) 
the downflows ending very near the surface and again correlated with vortex 
motions. The jets’ lifetimes are short, 1-3 min, but the other structures move in 
the box with a much longer lifetime, as in the model with no background field. This 
could match the fact that recurrent jets are often observed at the same location, 
which in our calculation is related to the long-lived mesospots. When superposing
the frames of Supplementary Videos 4 and 5 at the same time (for example 72.42
min), one can clearly see in the areas of ‘red’ downward velocities current sheets
that are compatible with torsional Alfvén waves, and sheets within the jets and
above cusps, as at y = 11 Mm. We conjecture that the jets can be associated with
observed spicules of type II.

Code availability. We have decided not to make our code available. Some parts
of it, which are embedded in a complicated way, are the private property of
institutions. The reader, however, should be able to reproduce our results by using
all the information provided in the Methods section.

Sample size. No statistical methods were used to predetermine sample size.

Extended Data Figure 1 | Temperature and density profiles. Shown are profiles of the temperature (blue), which is kept fixed during the simulation, and of the
initial density (black), using the same colour code as in Fig. 1a for the various layers of the atmosphere.

Extended Data Figure 2 | Parallel currents profile. Shown is the profile of
the average value of the parameter α associated with the electric currents
running along field lines. This quantity appears to take high values in the
confined lower and middle parts of the chromosphere. We use the same
colour code as in Fig. 1b for the various layers of the atmosphere.

Extended Data Figure 3 | Current-free magnetic configuration. Figure
shows selected field lines of an idealized magnetic configuration which is
current-free and has the same distribution of vertical component in the
photosphere {z = 0} as the field shown in Fig. 1c (at time t = 52.17 min). The
footpoints used to launch the field lines (from various planes) are the same as
those used in Fig. 1c. Although often used in simple models of the quiet
Sun, this kind of current-free configuration is very different from the one
produced by our model. In particular, it does not exhibit, either in the
chromosphere or above, many of the characteristic features (such as shear
and twist) of large magnetic energy and electric current density.

Extended Data Figure 4 | Pre-eruptive flow structure. a, Horizontal velocity
(v_x, v_y) computed at the top of the chromosphere (z = 2 Mm) at the same
time (t = 52.17 min) as the configuration of Figs 1c and 2d. The flow pattern is
clearly organized in regions containing vortex/torsional motions and other
ones that are more ‘calm’. The feet of the TFR of Fig. 2d are located at the
periphery of those flows. b, Zoom on one vortex/torsional flow structure V,
just above the one shown on Fig. 1c, g, with the norm of (v_x, v_y) as
background. c, Horizontal slice of the vertical component v_z of the velocity
at z = 3 Mm, where v_z starts to show red and blue shifts in the vortex/torsional
feature V, while v_z is mostly negative below this height. Other boxed
features, located at the same places as in a, are shown for information. d, Vertical
slice of v_z in the plane x = 6 Mm. Downward motions associated with CL
almost reach the photosphere.


©2015 Macmillan Publishers Limited. All rights reserved 


Extended Data Figure 5 | Eruptive flow structure. a, Horizontal velocity
computed at the top of the chromosphere (z = 2 Mm) at the same time
(t = 63.50 min, that is, during the eruption phase) as the magnetic
configuration of Fig. 2f. The eruptive area (E2) becomes free of vortex/torsional
motions, which are still present elsewhere, as in the vortex/torsional feature (V)
and concentrated legs (CL). Large vertical flows associated with the eruption
of the TFR are correlated with a positive vertical component v_z shown as a
horizontal slice in b and as a vertical slice in c.

Extended Data Figure 6 | Magnetic configuration with the addition of the
background magnetic field. Shown are selected field lines of the magnetic
configuration at time t = 72.42 min (as in Fig. 3) corresponding to the case
where a background vertical magnetic field of 5 G is added to mimic the
effects of the supergranulation network. Several characteristic features are
represented above an isosurface of the temperature fluctuations coloured with
the vertical component of the magnetic field. They confirm and extend those
of Fig. 1c: twisted flux rope (TFR), cusped flux tube (CFT) with coronal
streamer-like shape, vertical twisted flux tube (VTFT) guiding Alfvén waves,
and magnetic arcades (MA). The structures coloured in grey and green,
respectively, are VTFT.

Visible-frequency hyperbolic metasurface

Alexander A. High, Robert C. Devlin, Alan Dibos, Mark Polking, Dominik S. Wild, Janos Perczel,
Nathalie P. de Leon, Mikhail D. Lukin & Hongkun Park


Metamaterials are artificial optical media composed of sub-wavelength
metallic and dielectric building blocks that feature
optical phenomena not present in naturally occurring materials.
Although they can serve as the basis for unique optical devices that
mould the flow of light in unconventional ways, three-dimensional
metamaterials suffer from extreme propagation losses.
Two-dimensional metamaterials (metasurfaces) such as hyperbolic
metasurfaces for propagating surface plasmon polaritons have
the potential to alleviate this problem. Because the surface plasmon
polaritons are guided at a metal-dielectric interface (rather than 
passing through metallic components), these hyperbolic metasur- 
faces have been predicted to suffer much lower propagation loss 
while still exhibiting optical phenomena akin to those in three- 
dimensional metamaterials. Moreover, because of their planar 
nature, these devices enable the construction of integrated meta- 
material circuits as well as easy coupling with other optoelectronic 
elements. Here we report the experimental realization of a visible- 
frequency hyperbolic metasurface using single-crystal silver nano- 
structures defined by lithographic and etching techniques. 
The resulting devices display the characteristic properties of
metamaterials, such as negative refraction and diffraction-free
propagation, with device performance greatly exceeding those
of previous demonstrations. Moreover, hyperbolic metasurfaces
exhibit strong, dispersion-dependent spin-orbit coupling, enabling
polarization- and wavelength-dependent routeing of surface
plasmon polaritons and two-dimensional chiral optical components.
These results open the door to realizing integrated optical meta- 
circuits, with wide-ranging applications in areas from imaging and 
sensing to quantum optics and quantum information science. 

Our approach for realizing a visible-frequency hyperbolic metasurface
(HMS) involves the definition of a nanometre-scale silver/air
grating on a sputter-deposited, single-crystalline silver film by
electron-beam lithography and plasma etching. Unlike focused-ion-beam
milling methods, which produce rough, defect-ridden structures,
this new method produces smooth, high-quality silver nanostructures
with high aspect ratios, critical for the realization of a surface
plasmon polariton (SPP)-HMS. Figure 1 illustrates the materials and
structures that form the basis for HMSs. As a starting material, we
sputter-deposited a micrometre-thick silver film on a (111)-silicon
substrate at 300 °C and at high deposition rate (>1.5 nm s⁻¹; refs 16, 17).
High-resolution transmission electron microscopy (Fig. 1a), X-ray
diffraction, atomic force microscopy, and electron backscatter
diffraction measurements (see Supplementary Figs 1, 2 and 3,
respectively) reveal that these films are single-crystalline and have
root-mean-square roughnesses as low as 300 pm. The ellipsometric
characterization (Supplementary Fig. 4) shows that, over a large
portion of the visible spectrum, the optical loss in our film is much
lower than those reported previously and is comparable to
recently reported silver films prepared by molecular beam epitaxy.
Unlike the molecular beam epitaxy process, however, the sputtering
process can rapidly grow single-crystalline films of large thicknesses,
which is crucial for the realization of an HMS because it prevents SPP
absorption by the silicon substrate. From the experimentally
measured dielectric constants, we determine that the SPP propagation
length L_p (defined as the length over which the SPP intensity decays
to 1/e of its initial value) in a silver film exceeds 100 μm for far-field
wavelengths λ greater than 650 nm (Fig. 1b).
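
As a hedged illustration of how a propagation length follows from a dielectric constant, the sketch below uses the textbook dispersion relation for an SPP on a flat metal/dielectric interface, k_SPP = k_0 √(ε_m ε_d/(ε_m + ε_d)), with the intensity decay length L_p = 1/(2 Im k_SPP). The permittivity value used for silver is a hypothetical placeholder and not the measured value from this work, and the function name is likewise an assumption.

```python
import cmath
import math


def spp_propagation_length(wavelength_m, eps_metal, eps_dielectric=1.0):
    """1/e intensity decay length of an SPP on a flat metal/dielectric interface."""
    k0 = 2.0 * math.pi / wavelength_m
    k_spp = k0 * cmath.sqrt(eps_metal * eps_dielectric / (eps_metal + eps_dielectric))
    return 1.0 / (2.0 * k_spp.imag)


EPS_SILVER_ASSUMED = complex(-18.0, 0.5)  # hypothetical permittivity near 650 nm

length = spp_propagation_length(650e-9, EPS_SILVER_ASSUMED)
print(f"propagation length ~ {length * 1e6:.0f} micrometres")
```

Halving the imaginary part of the assumed permittivity roughly doubles the propagation length, which shows why a lower optical loss in the film translates directly into longer SPP propagation.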

Beyond exceptional optical performance, single-crystalline silver 
films offer mechanical and thermodynamic stability, which is crucial 
for defining nanoscale features using lithography and etching. We 
fabricated the silver/air gratings and light in- and out-coupling struc- 
tures (used to convert far-field light into SPPs and vice versa) by first 
defining an Al₂O₃ hard mask with electron-beam lithography and then
dry-etching silver with argon plasma (see Supplementary Fig. 6 for
details). After etching, the residual Al₂O₃ mask was removed with
hydrofluoric acid, leaving clean, high-aspect-ratio silver features. 
Figure 1d and e show scanning electron microscope (SEM) images 
of representative devices (schematically shown in Fig. 1c with silver 
ridge height h = 80 nm, width w = 90 nm, and pitch a = 150 nm). The 
smooth surface of our devices, coupled with the single-crystalline nat- 
ure of silver, minimizes extrinsic optical losses originating from grain 
boundaries and surface roughness. 

A recent theoretical work has predicted that a silver/air grating
with appropriate sub-wavelength feature sizes, such as that shown in
Fig. 1d and e, should exhibit hyperbolic dispersion for propagating
SPPs below a critical wavelength, λ = λ_T (and elliptical dispersion
above it). A simple physical picture provides insight into the transition
from hyperbolic to elliptical dispersion in this structure. At short
wavelengths, the plasmonic modes are tightly confined to the ridges
of the grating, qualitatively similar to the situation in an array of
parallel nanowires that exhibits hyperbolic dispersion. In the long
wavelength limit, on the other hand, the modes are only weakly con- 
fined, and the grating can be considered a perturbation to a flat surface, 
resulting in elliptical dispersion (for discussion, see Supplementary 
Figs 7 and 8). 

To experimentally verify these predictions, we fabricated a series of
devices and tested their optical properties. Figure 2 presents a device,
D1, designed to demonstrate negative refraction of SPPs, a known
property of hyperbolic metamaterials. This device consists of a
silver/air grating as well as a groove that launches SPPs on flat silver
upon far-field excitation (Fig. 2a). The angle of refraction at the flat
silver/grating interface was determined by collecting scattered light at
the in-coupling structure, the silver-film/grating interface, and the
corresponding out-coupling structure. As is clearly shown in Fig. 2b
and d (see also Supplementary Figs 9 and 10), the behaviour of D1
changes from normal refraction at λ > 540 nm to negative refraction
at λ < 540 nm. We note that λ_T, at which the device behaviour
changes from normal to negative refraction (that is, elliptical to
hyperbolic dispersion), can be tuned by varying the device geometry
(Supplementary Fig. 11a) or by changing the dielectric environment
of silver (for example, by depositing a thin layer of Al₂O₃ on the device;
see Supplementary Fig. 11b).

Figure 1 | Single-crystalline silver film and fabricated devices. a,
High-resolution transmission electron microscopy image taken down the [211] axis
that demonstrates the single-crystalline nature of the sputter-deposited silver.
The growth direction is from left to right along the [111] direction with the
surface of the silver film shown at the far right of the image. The inset shows the
electron diffraction pattern. b, SPP propagation length L_p on a silver film
derived from measured real and imaginary parts of the dielectric constants for
100-nm-thick (blue circles) and 1,200-nm-thick (pink circles) sputtered films.
For comparison, propagation lengths calculated using the dielectric
constants reported in refs 18-20 (green line) are also shown (error bars are one
standard deviation from the average value). c, Schematic of HMS, with pitch
a, width w and height h. d, A cross-sectional SEM image of a fabricated
device. e, Top-down SEM image of a silver/air grating and out-coupling
structures (top).

Figure 3 illustrates a second device, D2, with λ_T = 560 nm, which
exhibits another remarkable phenomenon: diffraction-free propagation.
At λ_T, the flat dispersion curve implies that all SPPs propagate
parallel to the silver ridges (Fig. 4b, Supplementary Fig. 12 and
discussions below) and thus SPPs excited on a single silver ridge remain
primarily confined to the same ridge despite its sub-wavelength width
(~λ/6). In device D2, slots defined directly on individual silver ridges
serve as in-coupling structures that convert far-field light to SPPs (or
out-coupling structures that convert SPPs to far-field light); see
Fig. 3a-c. As shown in Fig. 3d, at λ = 585 nm, despite being
excited by only one in-coupling structure, SPPs scatter off multiple
out-coupling structures owing to normal, diffractive propagation.
However, at λ_T = 560 nm SPPs primarily scatter off the out-coupling
structure located on the same ridge as the in-coupling structure
(Fig. 3e), signifying diffraction-free propagation. This diffraction-free
propagation, coupled with suitably designed ‘magnifying’
out-coupling structures, enables sub-diffraction-resolution imaging
and photon routeing. Figure 3g presents one such demonstration:
at λ_T = 560 nm, despite SPPs being launched at two in-coupling
structures separated by a sub-wavelength spacing of 150 nm, SPPs
primarily scatter off the two corresponding out-coupling structures
staggered along the propagation axis.

We next demonstrate a new optical phenomenon supported by the 
silver/air grating: the dispersion-dependent plasmonic spin-Hall effect 
(PSHE), a plasmonic analogue of the electronic Rashba 
and photonic spin-Hall effects. In electronic systems, spin-orbit 
coupling arises due to the lack of inversion symmetry (for example, in 
the presence of an external electric field) and couples the spin of a 
charge carrier to its propagation direction. Similarly, in an optical 
system with tightly confined optical modes, the light propagation 
direction couples to the electric field rotation, leading to the photonic 
spin-Hall effect. In the HMS, a PSHE arises from three structural 
features of the silver/air grating. First, this structure lacks inversion 
symmetry and exhibits high optical anisotropy for SPPs propagating 
parallel and perpendicular to the silver ridges (y- and x-axes, 
respectively, in Fig. 4a). Second, due to its nanoscale ridged structure, the 
HMS can support electric field components perpendicular to the SPP 
propagation direction (for example, E_x and E_z for propagation along y), 
and thus SPPs can exhibit circular polarization. Third, the dispersion 
of the system is strongly frequency dependent, such that the direction 
of the SPP group velocity changes when moving from hyperbolic to 
elliptical dispersion. Combined together, these three features enable 
the dispersion-dependent PSHE in which the SPP propagation 
direction and helicity are linked to each other in a frequency-dependent 
manner (Fig. 4). While the PSHE has been experimentally observed in 
bulk hyperbolic metamaterials at radio frequencies, it has not 
previously been realized experimentally in the visible frequency regime. 

Figure 2 | Measurement of SPP refraction at a flat silver/HMS 
interface. a, An SEM image of a device, D1. Far-field light is converted 
to SPPs in the silver film via the angled rectangular in-coupling 
structure at the top of the image. The excitation laser is unpolarized. 
The SPPs propagate along the silver film, are refracted at the 
film-metasurface interface, and are then scattered into the far field at 
the out-coupling structure at the bottom of the image. In this device, 
the height of the silver ridge is 80 nm, the width is 90 nm, and the pitch 
is 150 nm. b, The angle of refraction θ_out as a function of wavelength, 
with elliptical (red-shaded) and hyperbolic (blue-shaded) dispersion 
regimes indicated. The solid line is the simulated angle of refraction 
from FDTD simulations. The input and output angles, θ_in and θ_out, are 
defined in the inset, with the dotted box indicating the region of the 
HMS. c, The experimentally measured (blue data points) and simulated 
(black line) propagation length L_p of SPPs in the HMS (error bars are 
one standard deviation from the average value). d, Images of SPP 
refraction at the flat silver/HMS interface. The dashed boxes indicate 
the region of the HMS. 

Figure 3 | Observation of diffraction-free SPP propagation. a, A 
schematic for SPP diffraction in a silver/air grating. b, A schematic for 
diffraction-free SPP propagation in a silver/air grating. In a and b an 
in-coupling structure defined on a single ridge (left-hand side) acts as 
a point source of SPPs. SPPs scatter off at the out-coupling structures 
on the other side that are staggered along the y axis for magnification. 
c, An SEM image of a device, D2. The height of the silver ridge is 80 nm, 
the width is 90 nm, and the pitch is 150 nm. The y-axis distance between 
out-coupling structures is 1 µm, thus providing the magnification of 6.7. 
d, e, Optical images of SPPs scattering at the out-coupling slots 
(right-hand side) with SPPs excited with unpolarized light by a single 
in-coupling structure (inset to Fig. 3f) for λ = 585 nm (d) and 
λ = λ_T = 560 nm (e). f, Diagonal cross-sections of optical image across 
out-coupling slots at λ = 560 nm (blue) and λ = 585 nm (black), with the 
input position marked in red. The x-axis coordinate of the cross-section 
is indicated on the bottom axis. g, Similar cross-sections obtained when 
two in-coupling notches (inset) are used. In f and g, the out-coupling 
intensities are normalized such that the total integrated intensities are 
equal. 

Figure 4 | The dispersion-dependent plasmonic spin-Hall effect (PSHE). 
a, A schematic of the PSHE. In a silver/air grating, SPPs with different 
helicities propagate into distinct spatial directions. b, Illustration of 
the origin of the PSHE. Positive k_x corresponds to σ⁻ (blue shading) and 
negative k_x to σ⁺ (magenta shading). The direction of the group velocity 
(black arrows) is perpendicular to the isofrequency contours at a given 
dispersion regime: elliptic (red), diffractionless (green), and 
hyperbolic (cyan). The allowed k_x values extend from -π/a to +π/a, where 
a is the pitch of the silver/air grating. The dotted black line indicates 
the free-space isofrequency contour at the same frequency as the red 
curve. c, An SEM image of a device, D3, used to examine PSHE with an 
in-coupling structure (cyan rectangle), a silver/air grating (pink 
rectangle), and out-coupling cylinders (green rectangle). d, Image of D3 
under unpolarized laser excitation of the in-coupling structure. The 
out-coupling region is marked by the yellow box. e, f, Image from the 
out-coupling structures as a function of wavelength collecting only 
σ⁺ (e) and σ⁻ (f) polarized light. From bottom to top, the wavelength 
increases from 480 nm to 700 nm in increments of 20 nm. g, h, Light 
intensities measured at the out-coupling structures for σ⁺- (magenta) 
and σ⁻- (blue) polarized light at λ = 530 nm (g) and λ = 640 nm (h). 

Figure 4c shows a device, D3, which demonstrates the PSHE. In this 
device, a single circular in-coupling structure near the silver/air grating 
scatters unpolarized far-field light to propagating SPP modes, and an 
array of vertical silver cylinders on the other side (diameter 140 nm, 
height 90 nm) comprises an out-coupling structure that scatters SPPs 
back to the far field. These out-coupling structures convert the SPP 
polarization into far-field polarization with minimal distortion (see 
discussions in Supplementary Fig. 13). As shown in Fig. 4d, when 
SPPs are excited by the in-coupling structure, they split into two sepa- 
rate (left- and rightward) beams. Polarization-resolved imaging reveals 
that these split SPP beams exhibit opposite circular polarizations (one 
predominantly (~80%) σ⁺ and the other σ⁻, with σ± = E_x ± iE_z; 
Fig. 4g and h). Moreover, this helicity-dependent splitting is strongly 
wavelength-dependent (Fig. 4e and f). When the silver/air grating 
exhibits an elliptical dispersion (λ > 580 nm), the σ⁺ (σ⁻)-polarized 
SPPs deflect to the left (right), with the deflection angle decreasing with 
decreasing λ. When the silver/air grating enters the hyperbolic regime 
(λ < 580 nm), this behaviour reverses (similar to the magnetic 
field switching sign in the electronic Rashba effect), and the σ⁺ 
(σ⁻)-polarized SPPs deflect to the right (left). At the diffraction-free 
propagation point (λ_T = 580 nm), the splitting between σ⁺ and σ⁻ 
helicities vanishes. 

The physical origin of this dispersion-dependent PSHE can be 
understood using the isofrequency contour schematically illustrated 
in Fig. 4b. In the silver/air grating, the SPP helicities are determined by 
the wave vector that characterizes the electric field. The electric field of 
an SPP mode is given by $\mathbf{E} \sim \hat{r} - (k_r/k_z)\,\hat{z}$, 
where $k_z = i\sqrt{k_r^2 - k_0^2}$, $k_r = \sqrt{k_x^2 + k_y^2}$ is the 
plasmonic wavenumber in the direction of r̂, r̂ is the in-plane direction 
of k, and k_0 is the free-space wave number given by ω/c, where c is the 
speed of light in a vacuum. Because the SPPs 
are evanescent perpendicular to the surface, k_z is imaginary and the 
field rotates along r̂. Near the centre of the Brillouin zone (k_x = 0), we 
have k_r ≈ k_y ≈ k_0, and the SPP wavelength λ_SPP in our structures is 
similar to the far-field wavelength λ (Supplementary Fig. 14 and see 
discussion below). As k_x increases, however, k_r continuously increases 
due to the highly anisotropic dispersion relation, and for k_x >> k_0 the 
modes become circularly polarized in the x-z plane. Near the edge 
of the Brillouin zone, modes of opposite circular polarization 
hybridize, and the polarization effects in the x-z plane vanish. These 
considerations indicate that when projected on the x-z plane, an SPP 
mode with k_x < 0 (k_x > 0) exhibits σ⁺ (σ⁻) polarization (note that 
an SPP with k_x = 0 will exhibit electric field rotation in the y-z plane). 
The direction of SPP propagation is, however, governed by the group 
velocity vector that is perpendicular to the isofrequency contour. The 
small angular spread of the right- and leftward SPP beams is a 
consequence of the shape of this isofrequency contour, which reflects 
the optical anisotropy of our silver/air grating. As shown in Fig. 4b, 
this simple consideration explains our experimental observations. 
Specifically, when the device exhibits elliptical dispersion (λ > λ_T), 
the σ⁺ (σ⁻)-polarized SPPs deflect to the left (right) because the 
leftward (rightward) group velocity vector is associated with k_x < 0 
(k_x > 0). In contrast, in the hyperbolic dispersion regime (λ < λ_T), the 
reverse is true, leading to the switching of the deflection directions for 
σ⁺ (σ⁻)-polarized SPPs. 
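
The approach to circular polarization at large in-plane wavenumber can be checked numerically from the field expression above. The following is a minimal sketch, not the analysis used in this work: it assumes a lossless air-side field, normalizes the wavenumber to k_0 = 1, and uses hypothetical helper names (`spp_field_ratio`, `circularity`). It evaluates the ratio E_z/E_r = -k_r/k_z and a Stokes-type measure of circularity.

```python
import numpy as np

def spp_field_ratio(k_r, k0=1.0):
    """Return E_z/E_r = -k_r/k_z for an evanescent SPP with k_r > k0."""
    # k_z is purely imaginary because the field decays away from the surface.
    k_z = 1j * np.sqrt(k_r**2 - k0**2)
    return -k_r / k_z

def circularity(ratio):
    """Stokes-type circularity S3/S0 for a field with E_r = 1, E_z = ratio."""
    return 2.0 * np.imag(ratio) / (1.0 + np.abs(ratio) ** 2)

# Illustrative (hypothetical) values of k_r/k0, from near the light line upward.
for k_r in (1.05, 1.5, 3.0, 10.0):
    r = spp_field_ratio(k_r)
    print(f"k_r/k0 = {k_r:5.2f}  |E_z/E_r| = {abs(r):.3f}  "
          f"circularity = {circularity(r):.3f}")
```

In this sketch the ratio is purely imaginary, so the two components are always 90° out of phase; the amplitudes become equal, and the circularity approaches unity, only once k_r greatly exceeds k_0.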

A key improvement of our approach is the dramatic reduction in 
optical losses in comparison to bulk metamaterials. To directly char- 
acterize the optical performance of HMS, we fabricated silver/air grat- 
ings of varying lengths with identical light in- and out-coupling 
structures, and measured the out-coupled light intensity as a function 
of the grating length at the same in-coupling intensity. By fitting the 
intensity-length curve by a single exponential at a given λ, we then 
determined L_p in our devices, ranging from 6 µm at λ = 490 nm up to 
29 µm at λ = 610 nm (blue stars in Fig. 2c; for discussion see 
Supplementary Fig. 15). These propagation length L_p values are one 
to two orders of magnitude larger than those reported in bulk 
visible-frequency hyperbolic metamaterials. 
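
The single-exponential fit described above can be sketched as follows. This is an illustration only: it assumes the model I(L) = I_0 exp(-L/L_p), fits it by linear least squares on log-intensity, and uses hypothetical noise-free data; the function name `fit_propagation_length` is not from this work.

```python
import numpy as np

def fit_propagation_length(lengths_um, intensities):
    """Fit I(L) = I0 * exp(-L / L_p); return (L_p in um, I0)."""
    # A straight-line fit to log(I) versus L has slope -1/L_p.
    slope, intercept = np.polyfit(lengths_um, np.log(intensities), 1)
    return -1.0 / slope, np.exp(intercept)

# Hypothetical grating lengths (um) and out-coupled intensities (a.u.).
lengths = np.array([5.0, 10.0, 15.0, 20.0, 25.0])
intens = 2.0 * np.exp(-lengths / 12.0)

L_p, I0 = fit_propagation_length(lengths, intens)
print(f"L_p = {L_p:.2f} um, I0 = {I0:.2f}")
```

With real data, a weighted or nonlinear fit may be preferable, because taking the logarithm amplifies noise in the weakest signals.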

To analyse our observations quantitatively, we carried out detailed 
finite-difference time-domain (FDTD) simulations of SPP propaga- 
tion in our device geometries (D1 through D3). These FDTD simula- 
tions are in good agreement with the experimental observations in 
Figs 2-4: negative refraction for λ < λ_T (Fig. 2b, solid line), 
diffraction-free propagation at λ_T (Supplementary Fig. 12), and the 
dispersion-dependent PSHE (Supplementary Fig. 16). The FDTD 
simulations also indicate that the local polarization of the SPP field 
reaches 87% for left- and right-propagating circularly polarized 
SPPs, larger than experimentally observed values of 80%, due to the 
imperfect polarization conversion of our out-coupling structure 
(Supplementary Fig. 17). The simulated propagation lengths (black 
line in Fig. 2c; for discussion see Supplementary Information) in the 
HMS are ~30% larger than the experimentally determined ones, prob- 
ably owing to residual nanoscale roughness introduced during the 
fabrication procedure. Owing to increasing SPP confinement, the 
sensitivity to surface roughness increases at lower wavelengths, result- 
ing in greater scattering losses. Despite these imperfections, the 
measured propagation distances indicate that the low-loss, two- 
dimensional nature of our devices offers a substantial improvement 
over conventional bulk metamaterials in terms of optical loss, thereby 
opening the door for a wide array of high-performance plasmonic 
nanostructures. 

Although our demonstrations focused on one particular family of 
metasurfaces, that is, a silver/air grating, the fabrication strategy is 
general and compatible with other bottom-up and top-down 
semiconductor and metal processing techniques. Our method thus opens 
up the possibility of realizing low-dimensional transformation optics 
and metamaterial-based devices. The same method can be used to 
generate integrated metamaterial circuits that combine HMSs on-chip 
with other optoelectronic and plasmonic devices. The HMSs can 
enable quantum optics applications as well. Because of their small mode 
volumes and increased plasmonic density of states, HMSs can be 
used for enhancing interactions of SPPs with individual quantum 
emitters, a new pathway for realizing solid-state quantum nonlinear 
optical circuits. Moreover, the frequency-dependent spin-orbit interaction 
enables the exploration of a new class of solid-state quantum optical 
phenomena that involve chiral optical interfaces with quantum 
emitters. By extending recent demonstrations involving such interactions 
with one-dimensional waveguides into two dimensions, this could 
enable spin-dependent routeing of single photons as well as non-trivial 
topological phenomena that combine spin-orbit interactions with 
single photon nonlinearities associated with quantum emitters. 
Bipolar seesaw control on last interglacial sea level 


G. Marino, E. J. Rohling, L. Rodriguez-Sanz, K. M. Grant, D. Heslop, A. P. Roberts, J. D. Stanford & J. Yu 


Our current understanding of ocean-atmosphere-cryosphere 
interactions at ice-age terminations relies largely on assessments of the 
most recent (last) glacial-interglacial transition, Termination I 
(T-I). But the extent to which T-I is representative of previous 
terminations remains unclear. Testing the consistency of termination 
processes requires comparison of time series of critical climate 
parameters with detailed absolute and relative age control. However, such 
age control has been lacking for even the penultimate glacial 
termination (T-II), which culminated in a sea-level highstand during the last 
interglacial period that was several metres above present. Here we 
show that Heinrich Stadial 11 (HS11), a prominent North Atlantic 
cold episode, occurred between 135 ± 1 and 130 ± 2 thousand years 
ago and was linked with rapid sea-level rise during T-II. Our 
conclusions are based on new and existing data for T-II and the last 
interglacial that we collate onto a single, radiometrically constrained 
chronology. The HS11 cold episode punctuated T-II and coincided 
directly with a major deglacial meltwater pulse, which predominantly 
entered the North Atlantic Ocean and accounted for about 70 per cent 
of the glacial-interglacial sea-level rise. We conclude that, possibly 
in response to stronger insolation and CO₂ forcing earlier in T-II, 
the relationship between climate and ice-volume changes differed 
fundamentally from that of T-I. In T-I, the major sea-level rise 
clearly post-dates Heinrich Stadial 1. We also find that HS11 
coincided with sustained Antarctic warming, probably through a 
bipolar seesaw temperature response, and propose that this heat 
gain at high southern latitudes promoted Antarctic ice-sheet 
melting that fuelled the last interglacial sea-level peak. 
Late Quaternary glacial-interglacial cycles exemplify the response 
of Earth’s climate to combined insolation and CO₂ forcing. Their 
overall ‘sawtooth’ structure consists of gradual global cooling and 
ice-volume build-up during glaciations, followed by rapid warming 
and melting during glacial terminations. Millennial-scale episodes 
of North Atlantic cooling and ice-rafted-debris (IRD) deposition, 
commonly referred to as Heinrich events, punctuated each of the 
last five terminations. During T-I, North Atlantic cooling and IRD 
events occurred during Heinrich Stadial 1 (HS1, ~18-14.6 kyr ago, ka) 
and the Younger Dryas (YD, ~12.8-11.5 ka), which coincided with 
weakened Atlantic Meridional Overturning Circulation (AMOC), 
drought in the boreal tropics, and heat build-up and strengthening 
of wind-driven upwelling in the Southern Ocean. These 
developments were coupled with a bipolar temperature ‘seesaw’ and CO₂ 
release from the (Southern) ocean into the atmosphere. Neither 
HS1 nor the YD appears to correspond to periods with high rates of 
deglacial sea-level rise (so-called meltwater pulses, MWPs). This 
raises the question of whether the North Atlantic climate and sea-level 
change were decoupled only during T-I, or if this decoupling applied 
also to other terminations with different forcing histories. Accordingly, 
we need detailed assessments of the relative phasing between bipolar 
seesaw, atmospheric CO₂ and sea level through older terminations, 
while the absolute timing of these processes allows the resolution of 
their relationship with insolation forcing. 

Absolute chronologies for terminations older than T-I hinge on 
radiometric (U-series) dating of speleothems and corals, while marine 
sediment cores lack such independent age control. Pioneering studies 
placed pre-T-I terminations on radiometrically constrained timescales 
using: (i) correlation between North Atlantic deep-sea benthic stable 
oxygen isotopes (δ¹⁸O_benthic) and coral sea-level benchmarks, 
(ii) inferred similarity between North Atlantic sea surface temperature 
(SST) and an Italian speleothem δ¹⁸O (δ¹⁸O_speleothem) record 
interpreted in terms of precipitation change, and (iii) inferred 
synchronicity of North Atlantic cold events and Asian Monsoon weak episodes, 
archived in δ¹⁸O_speleothem time series. However, deep-sea 
hydrographic changes complicate interpretation of North Atlantic 
δ¹⁸O_benthic records in terms of sea-level change, and factors other 
than precipitation may affect δ¹⁸O_speleothem in the Mediterranean 
region. Finally, the relationship between North Atlantic climate and 
Asian Monsoon intensity may not be straightforward on millennial 
timescales. Hence, these studies laid critical foundations, but did not 
deliver a conclusive, quantitative (with uncertainties) chronological 
framework for pre-T-I terminations. Among these, T-II is noteworthy 
because its overall insolation forcing considerably exceeded that of T-I 
(Extended Data Fig. 1), and it culminated in an interglacial that was 
warmer than the Holocene, with sea level above present. 

Figure 1 | Location of the palaeoclimate archives discussed here, and the 
Mediterranean circulation pattern. a, Bathymetric map of the Mediterranean 
Sea with locations of the marine and continental records discussed in the text. 
b, West-east cross-section across the Mediterranean Sea with a sketch of 
the basin’s anti-estuarine circulation: surface inflow (black dashed arrows) 
through the Strait of Gibraltar and subsurface circulation and outflow 
(white arrows) into the Atlantic Ocean (Methods). Yellow stars indicate the 
upper intermediate-water depth habitat of N. pachyderma (d) (Methods) at 
the location of core LC21 (35° 40’ N, 26° 35’ E, water depth 1,522 m) and 
Ocean Drilling Program (ODP) Site 975 (38° 53.8’ N, 4° 30.6’ E, water depth 
2,415 m). SW, surface water (blue); IW, intermediate-water (red); DW, 
deep water (grey). 

To advance the debate, we present a new radiometrically constrained 
chronology for North Atlantic records of climate variability across T-II 
and the last interglacial period. We use records that are co-registered in 
single sediment-sample series (that is, phase relationships are 
unambiguous), and well-understood land-sea climate relationships to 
transfer, with rigorous uncertainty propagation, radiometric (U-series) 
ages to marine records. We exploit the well-documented 
intermediate-water connectivity between the eastern and western Mediterranean 
Sea (Methods), and the relationship between marine surface water 
microfossil δ¹⁸O and U-series-dated regional δ¹⁸O_speleothem records 
(Extended Data Fig. 2). We generated centennially resolved δ¹⁸O 
and δ¹³C records for Mediterranean surface- and intermediate-water 
dwelling planktic foraminifera Globigerina bulloides and 
Neogloboquadrina pachyderma (dextral) (Methods), respectively, from western 
Mediterranean Ocean Drilling Program (ODP) Site 975 (ODP975, 
Fig. 1). Next, we synchronized the ODP975 δ¹⁸O_N. pachyderma (d) record 
to the δ¹⁸O_N. pachyderma (d) record of eastern Mediterranean core LC21 
(Fig. 2a; Methods). The latter was placed previously on a radiometric 
timescale by relating its co-registered surface δ¹⁸O signal to Soreq 
Cave δ¹⁸O_speleothem. Finally, we use the co-registered δ¹⁸O_G. bulloides of 
ODP975 to further transfer the radiometrically constrained timescale 
to the δ¹⁸O_G. bulloides record of nearby ODP Sites 976 (ODP976) and 
977, and to core MD01-2444 and, in turn, to the SST and/or IRD 
records of North Atlantic climate variability that are archived in these 
sediment cores (Extended Data Figs 2, 3, Methods). 
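
The transfer of a timescale between cores through tie points can be illustrated with a short sketch. It assumes simple linear interpolation between tie points and uses hypothetical depths and ages; the actual tie points and their uncertainties are those given in the Methods and Extended Data Tables, and the function name `transfer_ages` is ours.

```python
import numpy as np

def transfer_ages(sample_depths_m, tie_depths_m, tie_ages_ka):
    """Assign ages to samples by linear interpolation between tie points.

    tie_depths_m must be increasing; samples outside the tie-point range
    are clamped to the first or last tie-point age by np.interp.
    """
    return np.interp(sample_depths_m, tie_depths_m, tie_ages_ka)

# Hypothetical tie points linking depth in a target core to reference ages.
tie_depths = np.array([10.0, 12.0, 15.0])    # metres
tie_ages = np.array([128.0, 131.0, 137.0])   # ka

sample_depths = np.array([10.5, 12.0, 13.5])
ages = transfer_ages(sample_depths, tie_depths, tie_ages)
print(ages)  # ages in ka for each sample depth
```

Linear interpolation implies a constant sedimentation rate between neighbouring tie points, which is itself an assumption that a full uncertainty analysis would need to address.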

Comparison of δ¹⁸O_G. bulloides from ODP975 and ODP976 on their 
new radiometrically constrained timescale with the Corchia Cave 
δ¹⁸O_speleothem record on its own radiometric chronology (Methods) 
reveals a striking similarity with respect to both timing and magnitude 
of δ¹⁸O shifts, notably between ~140 and ~129 ka (Fig. 2b). Possible 
correlation between western Mediterranean and Corchia Cave δ¹⁸O 
signals was contemplated before, but at that stage an alternative 
coupling of North Atlantic warming with increased precipitation over 
central Italy was favoured. The two interpretations lead to substantially 
(up to ~4 kyr) different timings for HS11. The approach used here is 
independent of assumptions about the phasing between North 
Atlantic climate and δ¹⁸O_speleothem. Instead, it diagnostically indicates 
a source-water control on Corchia Cave δ¹⁸O_speleothem; negative δ¹⁸O 
anomalies in North Atlantic and western Mediterranean surface 
waters during Heinrich events (see below) were transmitted via 
the hydrological cycle to the cave catchment. The δ¹⁸O similarity 
between ODP975 and Corchia Cave provides strong and independent 
validation of the LC21-based radiometric chronology for ODP975, 
and thus of the LC21-Soreq Cave synchronization used previously 
to provide radiometric age constraints to sea-level records. 
Validation of the LC21-Soreq Cave synchronization was obtained 
using U-series-dated Asian monsoon records, so the T-II chronology 
of sea-level change now involves agreement between three 
independent, radiometrically constrained approaches. 

Figure 2 | Construction and validation of a radiometrically constrained 
chronology for western Mediterranean Sea ODP975 and ODP976. 
a, Synchronization of N. pachyderma (d) δ¹⁸O from ODP975 (red line) to its 
counterpart from eastern Mediterranean core LC21 (blue line; Methods), 
which was placed on a radiometric chronology by ref. 8. Red filled circles and 
grey error bars depict the tie points used to synchronize ODP975 to LC21 
and the 2σ synchronization uncertainties, respectively (Methods, Extended 
Data Table 1). b, Synchronization of G. bulloides δ¹⁸O from ODP976 (grey line; 
after ref. 6) to its counterpart from nearby ODP975 (red line) on its 
radiometrically constrained chronology. Grey filled circles and error bars depict 
the tie points used to synchronize ODP976 to ODP975 and the 2σ 
synchronization uncertainties, respectively (Methods, Extended Data Table 2). 
Comparison of δ¹⁸O_G. bulloides from ODP975 and ODP976 with the speleothem 
δ¹⁸O (δ¹⁸O_speleothem) from Corchia Cave (Italy) is made to validate (not to 
constrain) the chronology of the western Mediterranean cores. Confidence 
limits (95% lighter sky blue, 68% darker sky blue) for the Corchia Cave 
δ¹⁸O_speleothem data (sky blue circles) result from Monte Carlo analysis 
(Methods) of the chronological and δ¹⁸O_speleothem uncertainties (ref. 7). 
c, Comparison of the western Mediterranean ODP976 sea surface temperature 
(SST) data (grey circles) on the radiometrically constrained chronology 
developed in this study with the Antarctic atmospheric methane (CH₄) record 
(orange circles) on the AICC2012 timescale, supporting consistency between 
the two independent chronologies. Confidence limits for the ODP976 SST 
(95% light grey, probability maximum with 95% uncertainty, heavier grey line 
and envelope) and atmospheric CH₄ concentrations (95% light orange, 
probability maximum with 95% uncertainty, heavier orange line and envelope) 
result from Monte Carlo analysis of their chronological and proxy related 
uncertainties, employing Gaussian filters of 0.2 and 0.4 kyr, respectively 
(Methods). Black bar indicates the timing of Heinrich Stadial 11 (HS11). 

Investigation of the interhemispheric climate relationships requires 
first an assessment of consistency between our new radiometrically 
constrained chronology and the AICC2012 timescale for the Antarctic 
EPICA Dome C (EDC) ice core. Building on the notion that 
atmospheric methane (CH₄) concentrations and Greenland/North Atlantic 
temperatures covaried on millennial timescales, we compare the 
western Mediterranean SST record (reflecting North Atlantic climate) 
on our new chronology with the EDC CH₄ record on AICC2012 
(Fig. 2c). The remarkable signal similarity between 131 and 128 ka 
supports consistency between the two independent chronologies 
across this interval. Such consistency plausibly extends back to 
~134 ka, when a distinct SST maximum coincided with a CH₄ peak. 
Although the latter is documented by a single data point in EDC, the 
Vostok Antarctic ice core CH₄ record appears to corroborate the 
occurrence of a CH₄ peak at this time. 
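
Confidence envelopes of the kind used for such record comparisons can be approximated with a generic Monte Carlo procedure. The sketch below is not the implementation described in the Methods: it assumes independent Gaussian errors on age and proxy value, omits the Gaussian smoothing filter, restores stratigraphic order by a simple sort, and uses hypothetical data and a hypothetical function name.

```python
import numpy as np

def monte_carlo_envelope(ages, values, age_sigma, value_sigma, grid,
                         n_iter=1000, seed=0):
    """Return the 2.5th, 50th and 97.5th percentile curves on `grid`."""
    rng = np.random.default_rng(seed)
    sims = np.empty((n_iter, grid.size))
    for i in range(n_iter):
        # Perturb ages and proxy values within their assumed 1-sigma errors.
        a = ages + rng.normal(0.0, age_sigma, ages.size)
        v = values + rng.normal(0.0, value_sigma, values.size)
        order = np.argsort(a)  # np.interp requires increasing x values
        sims[i] = np.interp(grid, a[order], v[order])
    return np.percentile(sims, [2.5, 50.0, 97.5], axis=0)

# Hypothetical SST series (degrees C) on a hypothetical age scale (ka).
ages_ka = np.linspace(128.0, 136.0, 17)
sst_c = 16.0 + 3.0 * np.cos((ages_ka - 128.0) / 8.0 * np.pi)
grid_ka = np.linspace(129.0, 135.0, 25)

low, mid, high = monte_carlo_envelope(ages_ka, sst_c, 0.5, 0.8, grid_ka)
print(np.round(high - low, 2))  # width of the 95% envelope along the grid
```

Two records can then be judged consistent over an interval where their envelopes overlap, which is the sense in which chronologies are compared here.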

In the following we examine in millennial-scale detail the relative 
timing between insolation, North Atlantic climate, sea level, Antarctic 
temperatures, and atmospheric CO₂ across HS11 and the transition 
into the last interglacial period (Fig. 3a-f). T-II contains two major 
meltwater pulses (MWP-2A and MWP-2B), which are centred on 
139 ± 1 and 133 ± 1 ka (2σ uncertainties) and coincided within 
uncertainties with two North Atlantic cooling episodes (Fig. 3c, d). 
MWP-2A indicates an early phase of ice-sheet retreat. Similar to its T-I 
counterpart at ~19 ka, it coincided with a short-lived (~2 °C) North 
Atlantic cooling. MWP-2B is more convincingly resolved and marks a 
steep ~70 m sea-level rise (~70% of the glacial-interglacial change) at 
rates of 28 ± 8 m kyr⁻¹ (refs 8, 9). It lagged 4 ± 1 kyr behind the boreal 
summer insolation rise and coincided with the prominent North 
Atlantic cold phase, HS11 (Fig. 3b-d). Our new chronology places 
the initiation and termination of HS11 at 135 ± 1 and 130 ± 2 ka 
(2σ uncertainties), respectively, and the ~0.5-kyr warm interlude that 
briefly interrupted it at 134 ± 1 ka (Fig. 3c), in overall agreement with 
the ‘synthetic’ Greenland record (Extended Data Fig. 3). HS11 also 
coincided with over half of the glacial-interglacial atmospheric CO₂ 
increase, and with ~9 °C Antarctic warming (Fig. 3e, f). Only ~5 °C 
of this warming can be ascribed to radiative forcing, which leaves 
~4 °C of ‘residual’ warming that we interpret as the Southern Ocean 
bipolar temperature seesaw response to meltwater-forced AMOC 
collapse and attendant North Atlantic cooling (Fig. 3c, d). 
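
For orientation, a rate of sea-level rise can be converted into an equivalent freshwater flux in sverdrups (1 Sv = 10⁶ m³ s⁻¹). The sketch below is a back-of-envelope conversion rather than the calculation given in the Supplementary Information: it assumes a fixed modern ocean area of 3.61 × 10¹⁴ m², a 365.25-day year, and a hypothetical function name.

```python
OCEAN_AREA_M2 = 3.61e14                  # assumed modern ocean surface area
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # assumed Julian year

def sea_level_rate_to_sverdrup(rate_m_per_kyr, ocean_area_m2=OCEAN_AREA_M2):
    """Convert a sea-level rate (m per kyr) to a freshwater flux in Sv."""
    volume_rate_m3_per_s = (rate_m_per_kyr * ocean_area_m2
                            / (1000.0 * SECONDS_PER_YEAR))
    return volume_rate_m3_per_s / 1.0e6  # 1 Sv = 1e6 m^3 per second

# Central rate and the bounds implied by 28 +/- 8 m per kyr.
for rate in (20.0, 28.0, 36.0):
    print(f"{rate:4.0f} m/kyr -> {sea_level_rate_to_sverdrup(rate):.2f} Sv")
```

Under these assumptions the central rate of 28 m kyr⁻¹ corresponds to roughly 0.32 Sv. The ocean area was smaller during glacial lowstands, so this simple conversion slightly overestimates the flux for a given rate.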

An intense seesaw response with strong North Atlantic cooling 
suggests that the meltwater entered the North Atlantic. This inference 
is supported by major North Atlantic IRD deposition (Fig. 3b), the 
large magnitude (~70 m) sea-level rise associated with MWP-2B 
(maximum freshwater flux of 0.3 + 0.04 Sv; Fig. 4a) that requires 
intense reduction of the large northern ice-sheets, and the distinct 
source-water-related negative shift in 5'°O of precipitation 
(AB "Os ccinitatian) at Corchia Cave (Fig. 4b), interpreted here as a 
reflection of meltwater-based freshening of the wider North Atlantic 
(Methods). The seesaw response at the time of HS11 is also corrobo- 
rated by its agreement with the established relationship between mag- 
nitudes of Antarctic warming and North Atlantic stadial duration 
during the last glacial cycle (Extended Data Fig. 4). Finally, HS11 
coincided with a weak Asian monsoon event'® and South American 
monsoon intensification”, both archived in radiometrically dated 
speleothems and interpreted in terms of a southward shift of the 
Intertropical Convergence Zone (ITCZ) in response to AMOC weak- 
ening. The relative timing between ITCZ dynamics, AMOC, and 
North Atlantic climate is well documented across T-I and within the 


Figure 3 | Interhemispheric records of climate change across glacial 
Termination II. a, Boreal and austral summer insolation curves; age is 
calculated using astronomical solutions*". b, Ice-rafted-debris (IRD) record 
from core MD01-2444, Iberian Margin’’ (North Atlantic). c, Sea surface 
temperature (SST) record from ODP976, western Mediterranean Sea°. The 
95% confidence limits (light grey envelope) and probability maximum (heavier 
grey line) with 95% uncertainty (heavier grey envelope) of the SST data 

(grey circles) are based on a Monte Carlo analysis of chronological and SST 
uncertainties, employing a 0.2 kyr Gaussian filter (Methods). Profiles in b and 
c are on the radiometrically constrained chronology developed in this study 
(Methods). d, Rates of relative sea-level change (Szs1/5,) with associated 95% 
confidence limits (magenta shaded envelopes) of the probability maximum 
(magenta solid line) from Monte Carlo analysis of uncertainties in the relative 
sea-level reconstructions and chronology⁹. e, Composite atmospheric CO₂
concentrations from EPICA Dome C (EDC) (Methods). f, Antarctic air
temperatures¹³ (ΔT) from EDC. The 95% confidence limits (light blue
envelope) and probability maximum (heavier blue line) and its associated 95%
uncertainty (heavier blue envelope) for the ΔT data (blue circles) are based
on a Monte Carlo analysis of chronological and ΔT uncertainties, employing a
0.2 kyr Gaussian filter (Methods). Data in e and f are plotted on the AICC2012
timescale⁵⁵ (Methods). HS11, Heinrich Stadial 11; MWP, meltwater pulse.
Figure 4 | Origin and impacts of meltwater pulses during glacial
termination II. a, Freshwater fluxes calculated (Supplementary Information)
from rates of sea-level change⁹. b, Ice-volume and temperature corrected δ¹⁸O
of Corchia Cave speleothem (Methods), indicating isotopic anomalies of
precipitation in the cave catchment. Envelopes are the 95% confidence limits of
the probability maximum (solid line) obtained from a Monte Carlo approach
that takes into account proxy related and chronological uncertainties of the
various records (Methods). c, Relative Sea Level (RSL)⁹. The 95% confidence
limits (light magenta) and the probability maximum with 95% uncertainty
(heavy magenta line and shading) were presented in ref. 9. d, Antarctic heat
loss/gain obtained by integrating the Antarctic temperatures¹³ over a period of
0.75 ± 0.15 kyr, using a Monte Carlo approach employing a 0.8 kyr Gaussian
filter of the data (blue).


last glacial cycle”*. Similar scenarios were suggested for previous ter- 
minations’* (including T-II), but required assumptions about timing 
relationships between records. By placing all records on a single radio- 
metrically constrained chronology, our results overcome this limita- 
tion, and independently corroborate the scenario for T-II. 

Since ~70% of the glacial-interglacial sea-level rise (MWP-2B) 
occurred during a phase of weak AMOC (HS11; Fig. 3b-d), T-II fun- 
damentally differed from T-I. During T-I, ~75% of the sea-level rise’! 
post-dated the major deglacial cooling phase in the North Atlantic 
(HS1), and the largest meltwater pulse (MWP-1A; 15-20% of the 
deglacial sea-level rise) peaked when the North Atlantic was warm 
(or warming) and AMOC was relatively strong (or strengthening)*""”’. 
This fuelled arguments that Antarctic ice sheets contributed substan- 
tially to MWP-1A (ref. 10). In contrast, we find that MWP-2B was 
more than three times larger than MWP-1A, and was tied directly 
to circum-North-Atlantic ice-sheet reduction and attendant North 
Atlantic cooling. This fundamentally different relationship between 
North Atlantic climate and sea-level change during the last two 
terminations indicates that during T-II, Northern Hemisphere ice 
sheets collapsed early in the termination, possibly in response to a
combination of overall stronger (and more rapidly rising'*) boreal summer
insolation and higher atmospheric CO₂ (Extended Data Fig. 1).

Our reconstructed phasing of interhemispheric climate develop- 
ments through HS11 is consistent with a southward shift and/or pos- 
sible strengthening of Southern Ocean westerlies*®, and associated 
subsurface oceanic warming around Antarctica”. This provides a 
mechanism for conveying the oceanic heat related to the bipolar see- 
saw to Antarctica. The attendant excess warming of ~4°C in 
Antarctica” (see above) during HS11 would have then affected the 
Antarctic ice sheet stability through processes such as under-ice melt- 
ing and grounding line retreat of ice-shelves*’. To account for relatively 
long (but poorly known) adjustment timescales of this major ice sheet 
to warming, including the excess 4 °C, we approximate heat gain/loss 
by integrating the Antarctic temperature record over time’’. We 
find that the timings of maximum heat gain and the last interglacial 
sea-level peak® match best for an overall integration timescale of 
0.75 ± 0.15 kyr (Fig. 4c).

Securing records to a single, radiometrically constrained chronology 
reveals the sequence of events through T-II. An initial (minor but 
significant) ice-sheet reduction caused MWP-2A during a Northern 
Hemisphere insolation minimum. Next followed the main phase of 
northern ice-sheet reduction (MWP-2B), ~4 kyr after the onset of 
insolation rise, which caused AMOC collapse, with attendant North 
Atlantic cooling and (seesaw) Southern Ocean warming. We infer that 
resultant Antarctic heating drove continuation of sea-level rise well 
after MWP-2B, up to several metres above present’, which predomi- 
nantly reflects Antarctic ice reduction. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS

Core material. Ocean Drilling Program (ODP) Site 975 was drilled by RV Joides 
Resolution in the western Mediterranean Sea (Fig. 1; 38° 53.8’ N, 4° 30.6’ E; water 
depth 2,415 m). ODP Site 975 is located on the Menorca Rise”, along the path of 
the Atlantic surface water entering the Mediterranean Sea through the Strait of 
Gibraltar and flowing eastward across the basin’*-**. Sections 2H4 to 2H6 from 
ODP Site 975 Hole C were sampled using u-channels, and subsequently sliced at 1 
cm resolution. Data were generated for this study at a resolution of 1 to 2 cm. 
Stable oxygen (δ¹⁸O) and carbon (δ¹³C) isotopes. For ODP Site 975, 20-30
individuals of G. bulloides (300-355 µm size fraction) and 10-20 individuals of
N. pachyderma (d) (212-250 µm size fraction) were picked. For core LC21, 10-15
individuals of N. pachyderma (d) (250-300 µm size fraction) were picked to
increase the resolution and spliced together with previously published records*””
between 143 and 129 ka. To avoid complications arising from size-dependent vital
effects in planktic foraminifera**”’, we constrained the size window of the picked
foraminifera from ODP Site 975 and core LC21 to a maximum of ~55 µm. This
compromised the continuity of the G. bulloides δ¹⁸O and δ¹³C records between
129 and 127 ka in ODP Site 975, as G. bulloides abundance drops significantly in
that interval, and insufficient numbers were available in one strict size window.

Prior to analysis, foraminiferal samples were crushed, cleaned and ultrasonicated
briefly in methanol. The cleaned samples were then analysed (δ¹⁸O and
δ¹³C) at the Australian National University with a Thermo Finnigan Delta
Advantage mass spectrometer coupled to a Kiel IV carbonate device, in which
samples react with 105% phosphoric acid at 75 °C. Results were normalized to
the Vienna Peedee Belemnite (VPDB) scale with the NBS-19 (δ¹⁸O = −2.20‰,
δ¹³C = 1.95‰) and NBS-18 (δ¹⁸O = −23.20‰, δ¹³C = −5.014‰) carbonate
standards. External reproducibility (1σ) of a carbonate standard (NBS-19) was
better than ±0.08‰ for δ¹⁸O and ±0.06‰ for δ¹³C.
Chronology for ODP Site 975. Synchronization of western Mediterranean ODP 
Site 975 to eastern Mediterranean core LC21 exploits the strong oceanographic 
connectivity between the two Mediterranean sub-basins at intermediate depths. 
Vigorous intermediate circulation transports water westward from the Rhodes 
Gyre (nearby core LC21, Fig. 1), to the western Mediterranean, and eventually 
through the Strait of Gibraltar as Mediterranean outflow into the Atlantic Ocean 
(for example, refs 33-36). Data*® and numerical modelling*’ indicate that this 
intermediate-water connection is an enduring feature of Mediterranean circula- 
tion through time. Accordingly, characterization of past geochemical property 
changes in intermediate waters across the Mediterranean provides a valuable tool 
for synchronizing sediment cores from different sectors of the basin. To character- 
ize variability of the intermediate waters at ODP Site 975 and core LC21, we use 
new (see above) and existing**” δ¹⁸O and δ¹³C data for the upper intermediate-
water dwelling** planktic foraminifer N. pachyderma (d).

Based on the enduring intermediate-water connection, the δ¹⁸O_N. pachyderma (d)
profile of ODP Site 975 is synchronized with that of LC21 (Fig. 2a, Extended Data
Figs 2, 5), for which a radiometrically constrained chronology was recently
developed, based on a strong surface-water δ¹⁸O relationship with δ¹⁸O in speleothems
(δ¹⁸O_speleothem) in Soreq Cave (Israel)⁸. Specifically, we identified ten
δ¹⁸O_N. pachyderma (d) shifts, each represented by multiple (ranging from 3 to 34) data
points, and used their mid-points to place the ODP Site 975 records onto the LC21 
chronology of ref. 8 (Fig. 2a, Extended Data Fig. 5a). We linearly interpolated ages 
between age control points, assuming constant sedimentation rates between them. 
We adopted this approach because there is no evidence to suggest highly variable 
sedimentation rates at ODP Site 975. In addition, this assumption does not affect 
the age of the individual tie-points and hence our conclusions, given that the 
timing of HS11 is constrained by four age control points. 
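The interpolation step described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the tie-point depths are hypothetical placeholders (depths are not reported here), the three ages are the first three tie-point ages listed in Extended Data Table 1, and the helper name `depth_to_age` is an assumption. Linear interpolation between control points is equivalent to assuming a constant sedimentation rate within each segment.

```python
from bisect import bisect_right

# Hypothetical tie-point depths (m); the real depths are not given in the text.
TIE_DEPTH_M = [10.0, 10.5, 11.0]
# Tie-point ages (ka) from the first three rows of Extended Data Table 1.
TIE_AGE_KA = [116.68, 121.04, 123.34]


def depth_to_age(depth_m, depths=TIE_DEPTH_M, ages=TIE_AGE_KA):
    """Linearly interpolate an age (ka) for a depth between age control points."""
    if not depths[0] <= depth_m <= depths[-1]:
        raise ValueError("depth lies outside the tie-point range")
    # Index of the control point at or above the sample (clamped to the last segment).
    i = min(bisect_right(depths, depth_m) - 1, len(depths) - 2)
    frac = (depth_m - depths[i]) / (depths[i + 1] - depths[i])
    return ages[i] + frac * (ages[i + 1] - ages[i])


# Ages for a few illustrative sample depths.
sample_ages_ka = [depth_to_age(d) for d in (10.0, 10.25, 10.75, 11.0)]
```

Because the tie-point ages themselves are fixed, changing the interpolation scheme only moves samples within a segment, which is why the text notes that the assumption does not affect the ages of the tie-points.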

The ODP Site 975 and LC21 δ¹⁸O_N. pachyderma (d) profiles are virtually identical
between ~143 and ~132 ka. An abrupt shift to lighter values at ~132 ka
follows at ODP Site 975, while it is not observed in LC21. Next, the ODP Site
975 δ¹⁸O_N. pachyderma (d) stabilizes at values of ~1.6‰ before shifting again at 129
ka (as does its LC21 counterpart) to reach a distinct minimum at 128 ka. The
interval between 132 and 130 ka coincides with the strongest phase of Heinrich 
Stadial 11, when, according to our analysis, the meltwater discharge into the 
North Atlantic Ocean reached its maximum (see main text). This delivered 
large volumes of isotopically light freshwater to the North Atlantic and (via 
the Strait of Gibraltar) to the western Mediterranean Sea. Similar to (smaller) 
events during the last glacial cycle’, such influxes would have affected intermedi- 
ate levels of the water column, for example, through winter mixing in the north- 
western Mediterranean*’, where ODP Site 975 is located (Fig. 1). Between 128 
and 121 ka, the ODP Site 975 and LC21 δ¹⁸O_N. pachyderma (d) records diverge
somewhat (although both become more positive at 126-123 ka). This is
probably due to enhanced stratification in the eastern Mediterranean during the
monsoon-related freshwater inputs along the North African Margin*®°****°.
From ~121 to ~117 ka, there is a similar magnitude positive shift in the ODP
Site 975 and LC21 δ¹⁸O_N. pachyderma (d) records.

Our inferred synchronization of marine sediment records and the accuracy of 
the radiometrically constrained chronology we derived for ODP Site 975 can 
be independently validated (Extended Data Fig. 2). First, we did not use the 
δ¹³C_N. pachyderma (d) profiles for ODP975 and core LC21 to constrain the synchronization
between the two sites but use those data on the synchronized age model for
validation purposes (Extended Data Fig. 5b). There is a remarkable agreement
between the ODP975 and LC21 δ¹³C_N. pachyderma (d) profiles, particularly between
129 and 126 ka and between 123 and 120 ka, when both records have considerably
lighter values than in the preceding glacial and deglacial periods (Extended Data
Fig. 5b). The δ¹³C_N. pachyderma (d) minimum is a characteristic feature of intervals of
sapropel deposition in the eastern Mediterranean and testifies to subsurface accumulation
of isotopically light respiration products in response to decreased deep
sea ventilation** °°’. With an isotopic shift in excess of −3‰, this feature is
particularly prominent during last interglacial sapropel S5 (ref. 38), which is well-
documented*”**** and precisely dated⁸ (~128 to ~121 ka) in core LC21.
Contemporaneous δ¹³C_N. pachyderma (d) minima in both core LC21 and ODP Site
975 during S5, therefore, support our synchronization, in that they indicate that a
low-δ¹³C anomaly has been effectively transferred via intermediate waters from
the eastern to the western Mediterranean basin, in line with previous studies”.
The ~2 kyr shift to heavier δ¹³C_N. pachyderma (d) values between 126 and 124 ka at
ODP Site 975 reflects an interval of potentially weakened sapropel conditions in
the eastern Mediterranean during which deep-sea ventilation may have intermittently
resumed*”“5 and the transport of low-δ¹³C water to the western basin
was interrupted. Second, between 140 and 129 ka there is a strong similarity between
changes in the ODP Site 975 δ¹⁸O_G. bulloides and δ¹⁸O_speleothem from Corchia Cave
(Fig. 2b). This indicates that the large negative shift in the δ¹⁸O_speleothem between
~135 and ~130 ka resulted mostly from a ¹⁸O-depletion of similar magnitude
in the source of precipitation for Corchia, which at present” and on glacial-
interglacial timescales*' is the wider North Atlantic Ocean/western Mediterranean
Sea. Note that source water influence on δ¹⁸O_speleothem is a common feature
of the Mediterranean cave records***°**’. This interpretation is further corroborated
by the fact that increase in speleothem growth rates and decrease in
speleothem δ¹³C did not occur before ~130 ka’ and by a similarly timed
growth rate increase in a nearby cave**. Concurrence of these climate signals in
the mid-latitude speleothems indicates a transition from drier/colder to wetter/
warmer conditions’ that therefore occurred at the end of the large negative
δ¹⁸O_speleothem shift, that is, after 130 ka.

Chronology for ODP Sites 976 and 977 and core MD01-2444. The δ¹⁸O_G. bulloides
record for ODP Site 975 is used to transfer our new radiometrically
constrained chronology of ODP Site 975 to δ¹⁸O_G. bulloides in other western
Mediterranean (ODP Sites 976 and 977) and Iberian Margin (North Atlantic,
MD01-2444) sediment cores (Fig. 2b, Extended Data Figs 2, 3a, c). Alignment
of co-registered (same sample series) alkenone-based sea surface temperature
(SST) records from ODP Site 977 and core MD01-2444 with that from ODP
Site 976 lends credence to the robustness of the δ¹⁸O_G. bulloides-based synchronization
(Extended Data Fig. 3b, d).

Propagation of chronological uncertainties. Propagation of chronological 
uncertainties through the various synchronization steps used in this study to 
transfer the chronology of core LC21 (ref. 8) to ODP Site 975 and then to ODP 
Site 976, ODP Site 977, and core MD01-2444 follows the exact same approach 
used by Grant et al. (ref. 8). The steps are outlined below. 

(i) From core LC21 to ODP Site 975. For LC21 the chronological uncertainty at 
each depth level was obtained by linearly interpolating the uncertainties between 
the tie-points used for transferring the radiometric chronology of Soreq Cave to 
LC21 (ref. 8). Next we calculate root-mean-square errors (MSE) that propagate all 
the chronological uncertainties involved in the synchronization of ODP Site 975 to 
LC21. Specifically, we include the dating error associated with the LC21 chronology
of Grant et al. (ref. 8), the sample spacing of the δ¹⁸O_N. pachyderma (d) record
in core LC21, the sample spacing of δ¹⁸O_N. pachyderma (d) in ODP Site 975, and extra
uncertainty (2 or 3 times the sample spacing) for more ambiguous tie-points 
(indicated in red in the Extended Data Table 1). 

(ii) From ODP Site 975 to ODP Site 976. For ODP Site 975, the chronological 
uncertainty at each depth level was obtained by linearly interpolating the uncer- 
tainties between the tie-points used for transferring the radiometrically con- 
strained chronology of LC21 to ODP Site 975. Next we use MSE that propagate 
all the chronological uncertainties involved in the synchronization of ODP Site 
975 to ODP Site 976. This includes the ODP Site 975 chronological uncertainty 
from (i), the sample spacing of δ¹⁸O_G. bulloides in ODP Site 975, the sample spacing
of δ¹⁸O_G. bulloides in ODP Site 976, and extra uncertainty (2 or 3 times the sample
spacing) for more ambiguous tie-points (Extended Data Table 2). 
(iii) From ODP Site 975 to ODP Site 977. We include the ODP Site 975 chronological
uncertainty (as reported in (ii)), the sample spacing of δ¹⁸O_G. bulloides in
ODP Site 975, the sample spacing of δ¹⁸O_G. bulloides in ODP Site 977, and extra
uncertainty (2 or 3 times the sample spacing) for more ambiguous tie-points. 
These uncertainties were propagated as root-mean-square errors (MSE) 
(Extended Data Table 3). 

(iv) From ODP Site 975 to core MD01-2444. We include the ODP Site 975 
chronological uncertainty (as reported in (ii)), the sample spacing of
δ¹⁸O_G. bulloides in ODP Site 975, the sample spacing of δ¹⁸O_G. bulloides in core
MD01-2444, and extra uncertainty (2 or 3 times the sample spacing) for more 
ambiguous tie-points. These uncertainties were propagated as root-mean-square 
errors (MSE) (Extended Data Table 4). 
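Steps (i) to (iv) all combine independent uncertainty terms into a single value. A minimal sketch of that combination is given below, assuming a root-sum-square (quadrature) rule and assuming that the extra factor for ambiguous tie-points multiplies both sample spacings; the function names and the `dating_sigma_kyr` value are hypothetical. Under these assumptions the sketch reproduces the tabulated synchronization values for the two Extended Data Table 1 rows used as examples, but it is not the authors' code.

```python
import math


def combine_in_quadrature(*sigmas_kyr):
    """Root-sum-square of independent uncertainty terms (all in kyr)."""
    return math.sqrt(sum(s * s for s in sigmas_kyr))


def synchronization_uncertainty(spacing_a_kyr, spacing_b_kyr, ambiguity_factor=1.0):
    """Combine the sample spacings of two records at one tie-point.

    ambiguity_factor (assumed to be 1, 2 or 3) inflates both spacings
    for tie-points that are harder to identify.
    """
    return combine_in_quadrature(ambiguity_factor * spacing_a_kyr,
                                 ambiguity_factor * spacing_b_kyr)


# Extended Data Table 1, tie-point at 128.48 ka: spacings of 0.22 and 0.22 kyr.
sync_clear = synchronization_uncertainty(0.22, 0.22)
# Tie-point at 121.04 ka: spacings of 0.04 and 0.04 kyr, treated as ambiguous (factor 3).
sync_ambiguous = synchronization_uncertainty(0.04, 0.04, ambiguity_factor=3.0)

# Total age uncertainty adds the dating uncertainty of the reference chronology.
dating_sigma_kyr = 1.0  # hypothetical value, for illustration only
total_age_sigma = combine_in_quadrature(sync_clear, dating_sigma_kyr)
```

Quadrature addition treats the terms as independent, so the largest term dominates the total; this is consistent with the tables, where the age uncertainty is much larger than the synchronization term alone.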

Records of atmospheric CO₂ and CH₄ and Antarctic temperatures. Existing
records of atmospheric CO₂ and CH₄ concentrations and Antarctic atmospheric
temperatures were placed on the latest ice-core chronology (AICC2012)***°. For
atmospheric CO₂ concentrations (Fig. 3e), a composite was made of the records by
Landais et al. (ref. 56) and Schneider et al. (ref. 57). 

Probabilistic assessment of the time series. We performed a probabilistic assess- 
ment of uncertainties for several of the new and previously published proxy 
records discussed in our study (for example, Figs 2b, c, 3c, f) in order to evaluate 
their confidence levels, taking into full account the uncertainties associated with 
both their chronologies (X) and proxy measurements/calibrations (Y). This ana- 
lysis of the time series relies on Monte Carlo-style simulations of the input data 
using MATLAB. For each time series, all individual data points are separately 
and randomly sampled 10,000 times within their X- and Y-uncertainties. The 
‘Y-uncertainties’, related to the proxy reconstructions (for example, sea surface
temperature), are taken from the original source (for example, Martrat et al., refs 6,
58, 59). The ‘X-uncertainties’, related to the chronologies of the various records,
are either taken from the error propagation discussed above (Mediterranean and 
North Atlantic records) or from the original source (Corchia Cave’, sea-level*”, 
and Antarctic ice core records”). Each of the 10,000 iterations was then either 
linearly interpolated (for example, Fig. 2b) or smoothed using a Gaussian filter of 
defined time window (for example, Fig. 2c). Next, the probability distribution of 
the 10,000 iterations was assessed per time step, which allowed determination of 
the 68% (16th-84th percentile) and 95% (2.5th-97.5th percentile) probability 
intervals for the data. Finally, we determined the probability maximum (modal 
value) at each time step, and its uncertainties (notably, the 95% probability interval 
for the probability maximum). For details on this approach, see refs 8, 60. 
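The resampling procedure described above can be sketched in a few lines. This is a simplified illustration, not the authors' MATLAB implementation: it assumes Gaussian X- and Y-uncertainties, re-sorts the perturbed points by age before interpolating, omits the Gaussian smoothing option, and reports the median as a stand-in for the probability maximum (modal value). The function names are hypothetical.

```python
import random
import statistics
from bisect import bisect_right


def interp_clamped(t, xs, ys):
    """Linear interpolation of (xs, ys) at t; end values are held outside the range."""
    if t <= xs[0]:
        return ys[0]
    if t >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, t) - 1
    frac = (t - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


def monte_carlo_envelope(ages, values, age_sigma, value_sigma, grid,
                         n_iter=10_000, seed=0):
    """Return (2.5th percentile, median, 97.5th percentile) per time step in grid."""
    rng = random.Random(seed)
    samples = [[] for _ in grid]  # simulated values collected per time step
    for _ in range(n_iter):
        # Draw every data point within its X (age) and Y (proxy) uncertainty.
        pts = sorted(
            (rng.gauss(a, sa), rng.gauss(v, sv))
            for a, v, sa, sv in zip(ages, values, age_sigma, value_sigma)
        )
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        for k, t in enumerate(grid):
            samples[k].append(interp_clamped(t, xs, ys))
    envelope = []
    for column in samples:
        # With n=40, the first and last cut points are the 2.5th and 97.5th percentiles.
        cuts = statistics.quantiles(column, n=40, method="inclusive")
        envelope.append((cuts[0], statistics.median(column), cuts[-1]))
    return envelope
```

The 68% interval mentioned in the text would be obtained the same way from the 16th and 84th percentiles of each time step.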
Calculation of freshwater fluxes from rates of sea-level change. Code availabil- 
ity. Freshwater fluxes shown in Fig. 4a were calculated from rates of sea-level 
change’, using a script that was developed in MATLAB and is available in the 
Supplementary Information. 
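For orientation, the conversion from a rate of sea-level change to a freshwater flux can be sketched as below. This is not the authors' script (which is in the Supplementary Information and may differ); it assumes a constant modern ocean surface area of about 3.61 × 10¹⁴ m² and ignores any change of ocean area with sea level. The constant and function names are hypothetical.

```python
OCEAN_AREA_M2 = 3.61e14                  # assumed constant ocean surface area
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # Julian year in seconds
SVERDRUP_M3_PER_S = 1.0e6                # 1 Sv = 10^6 m^3 s^-1


def freshwater_flux_sv(rate_m_per_kyr, ocean_area_m2=OCEAN_AREA_M2):
    """Convert a rate of sea-level rise (m per kyr) to a freshwater flux (Sv)."""
    volume_rate_m3_per_yr = (rate_m_per_kyr / 1000.0) * ocean_area_m2
    return volume_rate_m3_per_yr / SECONDS_PER_YEAR / SVERDRUP_M3_PER_S


# A rise of 10 m per kyr corresponds to roughly 0.11 Sv under these assumptions.
flux_10m_per_kyr = freshwater_flux_sv(10.0)
```

The relationship is linear, so uncertainty envelopes on the rate of sea-level change map directly onto the flux.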

Relationships between sea level and seawater oxygen isotopes. The response of 
δ¹⁸O in planktic foraminifera from the eastern Mediterranean Sea to sea-level
change has recently been quantified®. This builds on the notion that sea level 
controls water exchange between the Mediterranean Sea and the Atlantic Ocean 
through the Strait of Gibraltar and, in turn, the residence time of waters in the 
basin® ©. Because of the highly evaporative conditions over the Mediterranean 
region, a longer residence time of water leads to higher salinity and δ¹⁸O of seawater,
which is, in turn, archived in the δ¹⁸O of foraminiferal calcite**. Here we
use results from the model of ref. 60 to decipher the ‘glacial ¹⁸O-enrichment’ in
eastern Mediterranean seawater, which is approximated by a second-order poly- 
nomial regression (Extended Data Fig. 6). Note that the generous uncertainties 
associated with the regression are mostly systematic and therefore have minor 
impacts on the relative seawater δ¹⁸O (δ¹⁸O_seawater) shifts assessed here. Next, we
(re)calculate the relationship between sea-level lowering and mean open-ocean
δ¹⁸O_seawater due to freshwater loss from the ocean to the continental ice sheets
during glacial times. This was done before®’, but here we use the latest sea-level
assessment for the last glacial maximum’ (LGM) and the contemporaneous
oceanic δ¹⁸O_seawater value reconstructed from pore water analyses of deep-sea
cores®. Finally, we evaluate the response of western Mediterranean δ¹⁸O_seawater
to sea-level change by employing a linear mixing model of the open ocean
(30 ± 5%) and eastern Mediterranean (70 ± 5%) end-members at each sea-level
value that probabilistically evaluates the uncertainties in the various regressions. 
This mixing ratio is consistent with the range of salinity gradients across the 
Mediterranean Sea under present-day and LGM boundary conditions™. 
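The linear mixing step can be illustrated with a short sketch. The end-member values passed in below are hypothetical placeholders (the regressions themselves are shown in Extended Data Fig. 6 and are not reproduced here), the ±5% on the mixing fraction is treated as a Gaussian 1σ by assumption, and the function names are illustrative only.

```python
import random


def western_med_d18o(open_ocean, eastern_med, eastern_fraction=0.70):
    """Two end-member linear mix of seawater d18O (per mil)."""
    return eastern_fraction * eastern_med + (1.0 - eastern_fraction) * open_ocean


def mixing_monte_carlo(open_ocean, eastern_med, fraction_mean=0.70,
                       fraction_sigma=0.05, n_iter=10_000, seed=0):
    """Propagate the mixing-fraction uncertainty; returns (2.5%, median, 97.5%)."""
    rng = random.Random(seed)
    draws = sorted(
        western_med_d18o(open_ocean, eastern_med,
                         rng.gauss(fraction_mean, fraction_sigma))
        for _ in range(n_iter)
    )
    return (draws[int(0.025 * n_iter)],
            draws[n_iter // 2],
            draws[int(0.975 * n_iter) - 1])


# Hypothetical end-member enrichments (per mil) at a single sea-level value.
low, mid, high = mixing_monte_carlo(open_ocean=1.0, eastern_med=2.0)
```

In a fuller treatment the end-member values would themselves be drawn from the uncertainty envelopes of their regressions at each sea-level value, as the text describes.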
Corchia Cave palaeoprecipitation δ¹⁸O changes and source-water effect. To
obtain a time series of the δ¹⁸O anomaly of precipitation (Δδ¹⁸O_precipitation) in the
catchment area of Corchia Cave between 145 and 115 ka (shown in Fig. 4b), we
‘deconstructed’ the δ¹⁸O_speleothem into its main components. These include
the cave temperature that controls the isotopic fractionation between dripping
water and inorganic (speleothem) calcite*® and the δ¹⁸O of precipitation
(δ¹⁸O_precipitation). The latter reflects the interplay of two independent factors:
(i) the δ¹⁸O of the mean marine source of precipitation, for which variability
depends on the glacial ¹⁸O-enrichment (see above), and (ii) the residual, here
referred to as the Δδ¹⁸O_precipitation component, which reflects variations in the
isotopic composition of precipitation either due to inputs of isotopically light 
freshwater to the North Atlantic and (via Gibraltar) to the western 
Mediterranean Sea (for example, from melting ice sheets) or due to changes in 
the amount of precipitation’. 

Our approach relies on the assumption that the temperature change within 
the cave reflects mean annual temperature changes in the regional climate’, 
as reflected in the western Mediterranean SST. We also consider the propagated 
chronological uncertainties of the three records used to derive the Δδ¹⁸O_precipitation
time series. To probabilistically evaluate the impacts of these assumptions and
uncertainties on Δδ¹⁸O_precipitation, we have used a Monte Carlo approach (see
above), in which the SST record’, the sea-level driven mean ocean δ¹⁸O_seawater,
and δ¹⁸O_speleothem from Corchia Cave are given as input data along with
the propagated chronological uncertainties. Note that calculations of the
Δδ¹⁸O_precipitation using a glacial mean ocean ¹⁸O-enrichment and a western
Mediterranean ¹⁸O-enrichment are statistically indistinguishable across the interval
of interest. This analysis is possible here for the first time because we placed all
relevant records on the same timescale with rigorous assessment of the propagated 
uncertainties. 

The negative Δδ¹⁸O_precipitation shift in Corchia Cave during HS11 had a magnitude
of ~1.5‰ (Fig. 4b). An order of magnitude mass-balance assessment
suggests that a ¹⁸O-depletion by 1.5‰ in the wider North Atlantic and western
Mediterranean source areas for precipitation to Corchia Cave® is consistent with
addition of ~0.3 Sverdrup (Sv, 10⁶ m³ s⁻¹; Fig. 4a) of isotopically light (assumed
−40‰) meltwater to an 80-m-deep mixed layer with δ¹⁸O of 1‰ (ref. 69). This
implies a mixing ratio between marine and meltwater end-members of 40:1 and 
scales the marine flux in the Subtropical Atlantic to approximately 12 Sv. Despite 
the roughness of our calculation this overall agrees with a complete collapse of the 
thermohaline component of the Gulf Stream in the Subtropical North Atlantic 
during HS11 and with the persistence of only the wind-driven marine transport, 
which today accounts for ~17 Sv (ref. 70). 
Extended Data Figure 1 | Orbital parameters, insolation forcing, and
atmospheric CO₂ concentrations during the last two glacial-interglacial
transitions. a, f, Incoming solar radiation on 21 June and 21 December at
65° N (orange) and 65° S (blue), for T-I (a) and T-II (f)*". b, g, Precession of
the Earth’s axis*’ for T-I (b) and T-II (g). c, h, as b, g, but for eccentricity of
the Earth’s orbit*’. Note the different axis scales in c and h. d, i, Obliquity of the
Earth’s axis*' for T-I (d) and T-II (i). Different orbital configurations during
glacial terminations I (T-I) and II (T-II) led to markedly different insolation
forcing*’”’”, with the summer 65° N insolation increase during T-II larger
by ~35% (~23 W m⁻²), and occurring at faster rates'*, than that during T-I.
e, j, Atmospheric CO₂ concentrations during T-I (refs 74-76) and T-II (ref. 56).
Yellow bands illustrate the timing of Heinrich stadial 1 (HS1) and 11 (HS11),
following previous literature’ and this study, respectively. Dashed blue lines
highlight that atmospheric CO₂ concentrations were systematically higher
during T-II than during T-I. Symbols indicate the maxima and/or minima in 
insolation, precession, eccentricity, and obliquity cycles. 
Extended Data Figure 2 | Flowchart of the approach used to construct and validate chronologies of the various records. See main text and Methods for details.
Extended Data Figure 3 | Synchronization of western Mediterranean Sea
ODP975 to ODP977 and Iberian Margin core MD01-2444. a, Synchronization
of G. bulloides δ¹⁸O from western Mediterranean ODP977 to its
counterpart from nearby ODP975. Circles and error bars depict tie points and
2σ synchronization errors, respectively (Methods, Extended Data Table 3). The
G. bulloides δ¹⁸O from ODP976 is only shown for comparison and is not used
to transfer our new, radiometrically constrained chronology to ODP977
(Methods). b, Validation of the synchronization exercise shown in a by 
comparing the alkenone-based sea surface temperature (SST) records from 
ODP977 and ODP976, on their new, radiometrically constrained chronology. 
The 95% confidence intervals (light grey envelope) and probability maximum 
(heavier grey line) and its associated 95% confidence intervals (heavier grey 
envelope) of the ODP976 SST data (grey circles) are based on a Monte Carlo 
analysis of chronological and SST uncertainties, employing a 0.2 kyr Gaussian
filter (Methods). The synthetic record of Greenland climate variability'® is also
shown. c, Synchronization of the G. bulloides δ¹⁸O from Iberian Margin
(Atlantic Ocean) core MD01-2444 to its counterpart from ODP975 (Western
Mediterranean, Methods). Circles and error bars depict tie points and 2σ
synchronization errors, respectively (Methods, Extended Data Table 3). The
G. bulloides δ¹⁸O from ODP976 is shown for comparison and is not used to
transfer the MD01-2444 records to our new, radiometrically constrained 
chronology (Methods). d, Validation of the synchronization exercise shown 
in c by comparing alkenone-based SST records from core MD01-2444 and 
ODP976 (shaded envelopes and circles as in b), on their new radiometrically 
constrained chronology. The synthetic record of Greenland climate variability'® 
is also shown. 
Extended Data Figure 4 | Relationship between the duration of the North
Atlantic cold phases (stadials) and the magnitude of Antarctic warming.
a, δ¹⁸O_ice time series from North Greenland Ice Core Project (NGRIP)”” (light
blue). The probability maximum (solid blue line) and associated 95%
confidence bounds (shaded blue envelope) of the δ¹⁸O_ice record result from
10,000 Monte Carlo simulations, employing a 0.15 kyr Gaussian filter through
the data and their chronological and δ¹⁸O_ice uncertainties (see Methods).
b, EPICA Dome C (EDC) temperature reconstructions (ΔT) based on δD_ice
data’? (light red). The probability maximum (solid red line) and associated 95%
confidence bounds (shaded red envelope) of the temperature record result from
10,000 Monte Carlo simulations, employing a 0.2 kyr Gaussian filter through
the data and their chronological and ΔT uncertainties. c, Comparison between
Antarctic temperature reconstructions from EDC using the methods of
Jouzel et al. (ΔT, ref. 13) and Stenni et al. (T_site, ref. 78), revealing no difference
in the amplitude of the estimated Antarctic temperature shifts. d, Relationship
between duration of Greenland stadials and Antarctic warming for bipolar
seesaw events highlighted in a and b by yellow bands. Greenland and Antarctic
records are on the latest ice core chronology AICC2012 (ref. 55). Error bars
depict 1σ uncertainty in the magnitude and duration of North Atlantic cooling.
Extended Data Figure 5 | Synchronization of western Mediterranean Sea
ODP975 to eastern Mediterranean core LC21. a, Synchronization of the
N. pachyderma (d) δ¹⁸O from ODP975 to its counterpart from eastern
Mediterranean core LC21 (Methods), which was placed on a radiometric
chronology by ref. 8. Red filled circles and error bars depict the tie points used to
synchronize ODP975 to LC21 and the 2σ synchronization uncertainties,
respectively. b, Validation of the synchronization exercise shown in a by
comparing the respective co-registered N. pachyderma (d) δ¹³C profiles from
the same cores.
Extended Data Figure 6 | Glacial ¹⁸O-enrichment in oceanic and
Mediterranean seawater. Blue (open ocean), black (eastern Mediterranean
Sea) and green (western Mediterranean Sea) lines illustrate the relationship
between seawater δ¹⁸O (δ¹⁸O_seawater) and relative sea level (RSL). The open
ocean relationship has been re-calculated here using the latest assessment (grey
symbol) of the sea-level lowstand" of the Last Glacial Maximum (26.5-19 ka,
ref. 79) and the mean ocean δ¹⁸O_seawater value derived from porewater analyses
in a suite of deep-sea cores®*. The eastern Mediterranean relationship is derived
from the model presented in ref. 60. The western Mediterranean relationship
(green line) reflects linear mixing between eastern Mediterranean (black
line) and open ocean (blue line) end-members, assuming that the residence
time effect? on δ¹⁸O_seawater (and salinity) in the western Mediterranean is
70 ± 5% of that in eastern Mediterranean™. Shaded envelopes are 1σ
confidence bounds, derived from probabilistic analysis of uncertainties.
Extended Data Table 1 | Synchronization of ODP975 to LC21

Age (ka BP)   ODP975 sample   LC21 sample     MSE synchronization   MSE age
              spacing (kyr)   spacing (kyr)   (kyr)                 uncertainty (kyr)
116.68        0.50            0.50            0.71                  1.52
121.04        0.04            0.04            0.17                  1.76
123.34        0.07            0.05            0.26                  1.56
128.48        0.22            0.22            0.31                  1.20
132.49        0.12            0.12            0.17                  1.17
133.94        0.04            0.08            0.09                  0.59
135.02        0.16            0.20            0.26                  0.56
137.16        0.09            0.22            0.24                  0.93
139.40        0.08            0.24            0.25                  1.33
142.28        0.11            0.15            0.19                  1.50

Sample spacing (1σ) uncertainties in ODP975 and LC21 and dating (1σ) uncertainties of LC21 (ref. 7) are combined in mean squared estimates (MSE). Tie-points in red include extra uncertainty (3 times the
sample spacing).
Extended Data Table 2 | Synchronization of ODP976 to ODP975

Age (ka BP)   ODP976 sample   ODP975 sample   MSE               MSE age
              spacing (kyr)   spacing (kyr)   synchronization   uncertainty (kyr)
115.63        0.18            0.23            0.29              1.54
118.33        0.03            0.25            0.25              1.59
122.24        0.26            0.18            0.31              1.76
125.02        0.21            0.36            0.41              1.52
129.04        0.06            0.08            0.10              1.20
132.45        0.07            0.08            0.11              1.18
133.98        0.14            0.08            0.16              0.76
135.08        0.07            0.09            0.11              0.58
137.34        0.08            0.08            0.12              0.92
137.97        0.02            0.06            0.07              1.06
139.36        0.06            0.06            0.08              1.30
142.28        0.54            0.16            0.56              1.57

Sample spacing (1σ) uncertainties in ODP976 and ODP975 and dating uncertainties of ODP975 (see Extended Data Table 1) are combined in mean squared estimates (MSE). Tie-points in red include extra
uncertainty (3 times the sample spacing).
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Extended Data Table 3 | Synchronization of ODP977 to ODP975

Age (ka BP)   ODP977 sample    ODP975 sample    MSE synchronization   MSE age
              spacing (kyr)    spacing (kyr)    (kyr)                 uncertainty (kyr)
115.63        0.23             0.23             0.32                  1.55
118.33        0.25             0.25             0.35                  1.65
122.24        0.60             0.18             0.63                  1.77
125.59        0.10             0.10             0.14                  1.41
129.04        0.08             0.08             0.11                  1.20
132.38        0.49             0.08             0.50                  1.20
134.07        0.34             0.08             0.35                  0.72
135.41        0.16             0.12             0.20                  0.67
137.34        0.13             0.08             0.16                  0.98
139.30        0.03             0.06             0.06                  1.31
142.28        0.09             0.05             0.10                  1.47

Sample spacing (1σ) uncertainties in ODP977 and ODP975 and dating uncertainties of ODP975 (see Extended Data Table 1) are combined in mean squared estimates (MSE).
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Extended Data Table 4 | Synchronization of core MD01-2444 to ODP975.

Age (ka BP)   MD01-2444 sample   ODP975 sample    MSE synchronization   MSE age
              spacing (kyr)      spacing (kyr)    (kyr)                 uncertainty (kyr)
118.930       0.075              0.150            0.168                 1.640
129.035       0.079              0.092            0.121                 1.203
132.300       0.077              0.077            0.108                 1.177
133.955       0.227              0.101            0.249                 0.718
134.810       0.065              0.065            0.092                 0.576
137.885       0.108              0.043            0.116                 1.031
139.165       0.015              0.045            0.047                 1.281

Sample spacing (1σ) uncertainties in MD01-2444 and ODP975 and dating uncertainties of ODP975 (see Extended Data Table 1) are combined in mean squared estimates (MSE).
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Experimental constraints on the electrical anisotropy 
of the lithosphere-asthenosphere system 


Anne Pommier, Kurt Leinenweber, David L. Kohlstedt, Chao Qi, Edward J. Garnero,
Stephen J. Mackwell & James A. Tyburczy


The relative motion of lithospheric plates and underlying mantle 
produces localized deformation near the lithosphere-astheno- 
sphere boundary’. The transition from rheologically stronger 
lithosphere to weaker asthenosphere may result from a small 
amount of melt or water in the asthenosphere, reducing viscos- 
ity’*. Either possibility may explain the seismic and electrical 
anomalies that extend to a depth of about 200 kilometres*’. 
However, the effect of melt on the physical properties of deformed 
materials at upper-mantle conditions remains poorly constrained®. 
Here we present electrical anisotropy measurements at high tem- 
peratures and quasi-hydrostatic pressures of about three gigapas- 
cals on previously deformed olivine aggregates and sheared 
partially molten rocks. For all samples, electrical conductivity is 
highest when parallel to the direction of prior deformation. The 
conductivity of highly sheared olivine samples is ten times greater 
in the shear direction than for undeformed samples. At tempera- 
tures above 900 degrees Celsius, a deformed solid matrix with 
nearly isotropic melt distribution has an electrical anisotropy fac- 
tor less than five. To obtain higher electrical anisotropy (up to a 
factor of 100), we propose an experimentally based model in which 
layers of sheared olivine are alternated with layers of sheared oliv- 
ine plus MORB or of pure melt. Conductivities are up to 100 times 
greater in the shear direction than when perpendicular to the shear 
direction and reproduce stress-driven alignment of the melt. Our 
experimental results and the model reproduce mantle conduc- 
tivity-depth profiles for melt-bearing geological contexts. The 
field data are best fitted by an electrically anisotropic astheno- 
sphere overlain by an isotropic, high-conductivity lowermost litho- 
sphere. The high conductivity could arise from partial melting 
associated with localized deformation resulting from differential 
plate velocities relative to the mantle, with subsequent upward melt 
percolation from the asthenosphere. 

Electromagnetic profiles of the lithosphere—asthenosphere system 
reveal zones of high electrical conductivity and electrical anisotropy, 
which vary with depth (Fig. 1)°”*. High conductivity can be attributed 
to several factors, including the presence of an interconnected fluid 
phase’. Regions of electrical anisotropy are usually attributed to mantle 
deformation that can result from the motion of rigid lithospheric 
plates relative to the underlying convecting mantle**. Experimental 
investigations under controlled laboratory conditions allow a direct 
assessment of the effect of deformation and chemistry on electrical 
conductivity, an important step in investigating the dynamic coupling 
of tectonic plates with the underlying mantle. 

The current laboratory-derived database of electrical anisotropy of 
mantle materials consists principally of measurements of electrical 
conductivity σ for different crystallographic orientations of dry and
hydrous olivine single crystals. Only one set of measurements has
been made for σ of melt-bearing olivine aggregates during torsion,
and these experiments were performed at low crustal pressure 


(0.3 GPa) and only to low shear strain (γ < 0.5-1), limiting the formation
of noticeable melt bands. Recently, new electrical measurements
on melt + olivine aggregates were performed during simple shear at
3 GPa, but only small strains (γ < 1.8) were reached.

Here we report the results of laboratory experiments at astheno- 
spheric pressure (about 3 GPa) and on samples previously deformed to 
high strains (γ ~ 9). These experiments were designed to investigate
electrical anisotropy in deformed mantle materials, in order to develop 
an electrical model of the upper mantle to be compared with field 
results. The electrical anisotropy of mantle materials was investigated 
by measuring the electrical conductivity of previously deformed sam- 
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Figure 1 | Anisotropic conductivity models obtained from inversion of 
electromagnetic data in different geological contexts. The conductivity ratio 
is for perpendicular directions σ_x and σ_y, with σ_x the direction of highest
conductivity in the asthenosphere. Electrical profiles are for the Pacific Ocean 
Basin (off the Mariana Trench) in grey, the East Pacific Rise (EPR, 200 km off 
the ridge axis) in red, and the Cocos plate (250 km off trench) in blue*”*. 
Anisotropy in the lithosphere under the Cocos plate is not well constrained. All 
profiles present a sublithospheric or asthenospheric region of high anisotropy, 
topped with a low-anisotropy layer. 
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ples in a multi-anvil apparatus at about 3 GPa and up to 1,300°C 
in two perpendicular directions using impedance spectroscopy 
(Extended Data Fig. 1). 

Dry polycrystalline olivine samples (Fo90), a Fo90 + 5 vol% MORB
sample, and a Fo90 + 2 vol% NaKCO3 melt sample were used. These
samples were initially deformed in triaxial compression or in simple 
shear (Methods and Extended Data Table 1) at a confining pressure 
P = 0.3 GPa and a temperature T = 1,200-1,250 °C in a gas-medium 
apparatus. The parts of the samples that had experienced maximum 
shear strain were extracted and placed in a conductivity cell in the 
multi-anvil apparatus (Extended Data Table 1) for measurements of 
electrical properties at a pressure representative of the pressure of the 
asthenosphere. 
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Measurements were primarily conducted under quasi-hydrostatic 
conditions at temperatures not exceeding 900°C with a few 
experiments at temperatures near 1,300°C (Methods). In the low- 
temperature conductivity runs, melt-enriched sheets (bands in two 
dimensions) formed during deformation runs are preserved as prev- 
iously observed'*"*, whereas after the high-temperature conductivity 
experiments the melt-rich bands are no longer detected (Extended 
Data Fig. 2). Instead, melt-bearing samples quenched from high- 
temperature measurements (>~1,100 °C) exhibited a sheared solid 
matrix with an approximately homogeneous (meaning a lack of melt- 
rich bands) and isotropic melt distribution, because in the high- 
temperature conductivity experiments, surface tension relatively 
quickly redistributes the melt, both in terms of dissipation of the 
melt-rich bands and in terms of randomization of the orientation of 
melt-filled triple junctions, thus removing the melt preferred orienta- 
tion introduced by deformation’*”. 

Nonetheless, a small degree of heterogeneity in melt distribution 
remained, with some areas showing slightly higher melt concentra- 
tions than others. This observation suggests that some structural 
anisotropy associated with the deformation-induced lattice preferred 
orientation persists for the duration of our conductivity measure- 
ments, in agreement with previous observations on sheared samples 
that were statically annealed after large strain deformation’*. Chemical 
analyses of run products are consistent with starting compositions 
(Extended Data Table 2). 

Because the temperature of the lithosphere-asthenosphere system is 
not well constrained, our experiments investigated the electrical prop- 
erties of mantle analogues over a wide range of temperature. For 
olivine samples, a single Arrhenius relationship can satisfactorily 
reproduce the electrical data over the investigated temperature range 
(Fig. 2a; Extended Data Table 3). For both MORB-bearing and car- 
bonate-bearing samples, a slight bowing in the trend of electrical con- 
ductivity with temperature is observed at about 870-900 °C, possibly 
caused by textural changes. Electrical data are therefore best repro- 
duced by two Arrhenius equations, one on either side of this temper- 
ature (Fig. 2b; Extended Data Table 4). For experiments performed at 
low temperature (<900 °C) on melt-bearing samples, values of σ at


Figure 2 | Electrical conductivity-temperature diagrams of experimental 
results. a, Results for olivine aggregates at low and high shear strains. Open and 
closed circles represent data parallel to and perpendicular to the main 
deformation direction, respectively. Our data agree well with grey zone (1), 
which corresponds to σ of olivine aggregates under pressure. Electrical data
on olivine single crystal in different crystallographic directions (zone (2),
with water content less than 430 H atoms per 10⁶ Si atoms) are from ref. 10.
b, Results for the melt-bearing systems. Open and closed symbols represent 
data parallel to and perpendicular to the main deformation direction, 
respectively. Only the electrical measurements of carbonate-bearing samples 
(diamonds) and of the MORB-bearing sample with a strain of 1.2 in the 
direction parallel to the main deformation (open black circles) were performed 
over a large temperature range from high to low temperature. All other 
experiments were performed either at high (>1,100 °C) or at low (<900 °C) 
temperature (Extended Data Table 4). Good agreement is obtained with data 
from zone (1)'*. The slight difference in the results from zone (2) at 0.1 MPa 
(ref. 28) is consistent with the effect of pressure. The conductivities from refs 29 
and 30 (zones (3), (4), and (5)) are higher than the ones measured in this 
study, possibly owing to melt chemistry effects. The dashed lines are Arrhenius 
relations for conductivity in the post-glass transition temperature range. 
Electrical anisotropy decreases with increasing temperature to become 
negligible at high temperature. c, Laboratory-based isotropic (blue colours) and 
anisotropic (red colours) models of the electrical conductivity of deformed 
mantle materials. Models are based on Arrhenius equations and on 
extrapolations of our results from low- to high-temperature experiments. 
Isotropic models for MORB-bearing samples at >1,200 °C are based on the 
Hashin-Shtrikman upper bound. Layered model 1 considers layers of 
polycrystalline olivine alternating with layers of sheared Fo90 + 5 vol% MORB,
and layered model 2 considers layers of polycrystalline olivine alternating with
layers of basalt (see Methods).
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higher T can be calculated by extrapolating electrical data collected 
above the glass transition (at about 670 °C) (dashed lines in Fig. 2b; 
Extended Data Table 4). Although this extrapolation to high temper- 
ature does not account for the slight bowing in the slope observed at 
around 870-900 °C for high-temperature samples, it provides a con- 
straint with the least amount of uncertainty on conductivity—strain 
relationships because the melt phase did not redistribute in these 
low-temperature experiments. It also satisfactorily reproduces the 
measurements of high-temperature samples performed up to 
1,300 °C (sample PT0683, Fig. 2b). In fact, the electrical conductivity 
of melts with a low degree of polymerization shows no major change in 
activation energy at temperatures above the glass transition temper- 
ature to 1,300 °C (ref. 19), and previous experimental studies have 
demonstrated that a single activation energy describes the electrical 
conductivity of olivine over the temperature range of interest, 
700-1,300 °C (ref. 20). 

Our results, which quantify the effects of deformation and partial 
melt on the electrical conductivity of mantle rocks, are summarized as 
follows. (1) Deformation affects the electrical response of all the sys- 
tems investigated, including both melt-free olivine compacts and 
melt + olivine samples (Fig. 2). This observation contrasts with results 
from a previous study that did not show anisotropy in sheared melt-free
samples, possibly because the shear strains were too low (γ < 1.8).
The effect on conductivity anisotropy of a relatively small amount of 
strain in triaxial compression of olivine aggregates is small (a factor of 
<2) but detectable (Fig. 2a). Simple shear deformation in torsion 
of olivine aggregates to a relatively large shear strain (γ ~ 3.5) results
in an enhancement by a factor of about ten in electrical conductivity 
parallel to the direction of shear compared to an undeformed sample. 
For this sample (PT0264), the lattice preferred orientation of olivine is 
defined by [100] subparallel to the shear direction, and by (010) sub- 
parallel to the shear plane (also called type-A fabric)’. The lattice 
preferred orientation of the olivine grains is unlikely to explain the 
high bulk conductivity of this sheared sample, because electrical aniso- 
tropy in olivine single crystals is only about a factor of two at 1,200 °C 
(ref. 10), much smaller than our measured factor of ten. The high 
conductivity in the high-strain sample may be due to the contribution 
of grain boundaries, as the grain size in the sheared samples is smaller 
than in an undeformed sample owing to dynamic recrystallization and 


the grain boundaries are sympathetically oriented in the shear samples. 
The decrease in activation energy observed for most samples at high 
temperature (more than about 1,040 °C) may be an indication that the 
stress regime preserved from the deformation experiment has relaxed 
to a greater degree during the electrical experiment. However, the 
deformed texture of the sample was preserved during the entire experi- 
ment (Extended Data Fig. 2a). No systematic effect of shear strain 
(for 1.2 < γ < 9) on conductivity was observed for melt-bearing
samples (Fig. 2b). 

(2) The addition of a few volume per cent of basaltic or carbonate 
melt to a polycrystalline olivine slightly increases the bulk electrical 
conductivity (by 10%-30%). For all melt-bearing samples, electrical 
anisotropy is large at low T in the glass region and above the glass 
transition temperature T_g up to 900 °C, but decreases at temperatures
greater than 900°C, consistent with the disappearance of melt-rich 
bands in the run products during the high-temperature conductivity 
experiments (Extended Data Fig. 2). This low electrical anisotropy at 
high temperature indicates a reorganization of the melt network in the 
multi-anvil apparatus, despite the persistence of some degree of struc- 
tural anisotropy: the crystallographic preferred orientation of the oliv- 
ine grains was maintained, but the melt preferred orientation 
weakened (controlled by anisotropy in the melt-solid interfacial 
energy rather than by stress) and the melt-rich bands dissipated in 
response to surface tension, leaving a nearly isotropic melt distri- 
bution. Samples with 5 vol% MORB and 2 vol% carbonate yield similar 
values for conductivity at a given temperature. From their electrical 
response alone, all investigated materials can potentially explain high 
conductivities in the upper mantle (typically = 0.03 S m⁻¹), emphasizing
that electrical conductivity alone cannot discriminate between
the effect of bulk composition and deformation textures. 

On the basis of the results of our experiments, we propose models for 
the electrical conductivity in both anisotropic and predominantly iso- 
tropic environments (Fig. 2c). Melt-bearing samples provide analogues 
of a melt-bearing isotropic mantle and of melt-rich bands in a highly 
anisotropic mantle. The bulk electrical conductivity of a banded mantle 
is calculated using a layered model that consists of either series or parallel 
circuit equations, depending on the direction considered (Methods). In 
this model, melt-rich layers correspond to either a MORB-bearing 
material using our conductivity measurements or a basaltic material 
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Figure 4 | Cross-section portrayal of the 
electrical conductivity of the uppermost mantle 
in a melt-bearing context. The corresponding 
conductivity ratio for the direction of lowest (σ_y) to
the highest (σ_x) conductivity is shown in red. Melt
reaches the asthenosphere from the deeper mantle. 
The geometry and the distribution of melt 
pathways in the deeper mantle do not significantly 
affect electrical anisotropy. In the asthenosphere, 
the horizontal alignment (of sheets, tubules, and so 
on) is the dominant process and results from 
mantle flow so electrical anisotropy is enhanced as 
a result of plate motion. Vertical melt migration has 
an insignificant influence on the geophysical 
response. Melt accumulates at the bottom of the 
lithosphere because the lithosphere is less 
permeable, and becomes well interconnected in all 
directions despite a deformed solid matrix. This 
melt is isolated from mantle flow, cooling and 
crystallizing. 


based on published conductivity results; melt-free layers correspond to
either undeformed or sheared polycrystalline Fo90.

Our results reproduce both the magnitude and the anisotropy of 
electrical conductivity in various melt-bearing contexts (Fig. 3). Near 
the East Pacific Rise, where a few volume per cent of melt is expected, 
the high anisotropy (σ_max/σ_min > 50, Fig. 3) detected at a depth of
80-120 km is best reproduced by a laboratory-based layered model 
(melt-bearing layered model 2 in Fig. 3) that is consistent with plate- 
motion-related strains in the asthenosphere that induce alignment 
of melt and formation of melt-enriched sheets in the direction of 
plate motion. 

Under the Cocos plate, the main conductivity anomaly in the 
asthenosphere (45-60 km) has a small anisotropy (σ_max/σ_min ~ 3;
Fig. 3) despite the fact that the high speed of the plate (8.5 cm yr⁻¹)
is expected to be accompanied by substantial strain in the underlying 
asthenosphere. Our layered models are too anisotropic to be compar- 
able to field observations under the Cocos plate (Fig. 3), suggesting that 
the geometric configuration of the melt phase under the Cocos plate is 
almost electrically isotropic. In highly deformed rocks, a low degree of 
anisotropy of the melt phase may appear paradoxical. However, this 
possibility can be reconciled at high temperatures, where stresses 
are lower and thus deformation produces a weaker crystallographic 
preferred orientation and melt preferred orientation than at lower T 
where stress is higher (to maintain the same strain rate). Temperatures 
exceeding 1,400 °C with a few volume per cent of hydrous melt have 
been previously suggested to explain these high conductivities’. Our 
experiments demonstrate that a temperature of around 1,300 °C is 
sufficient to explain the observed conductivities and that the weak 
anisotropy is reproduced by a sheared material in which melt pathways 
exist in all directions. 

Under the Pacific Ocean basin, at more than 100 km east of the 
Mariana Trench and at a depth of 150-250 km, the weak electrical 
anisotropy (σ_max/σ_min ~ 1.8, Fig. 3) has been attributed to mantle flow
in the Mariana Trough. As for the Cocos plate, the field data are best
reproduced by an isotropic melt phase interconnected in a sheared 
solid matrix at high temperature (>1,200 °C). The extrapolation of 
Arrhenian equations based on our data on sheared olivine + 5 vol% 
MORB (Fig. 2b; Extended Data Table 4) to 1,350°C reproduces the 


electrical field data off the Mariana Trench, with electrical conductivity 
values of 6.4 X 10 * S m'‘ in the direction of deformation and 
2.7 X 10°-* § m7’ in the direction perpendicular to deformation. 

Regions of high anisotropy embedded between layers of low aniso- 
tropy that have been observed beneath mid-ocean ridges and oceanic 
plates (Fig. 1) can be related to vertical stratification in mantle rheology 
and fluid distribution (Fig. 4). Deep regions of low anisotropy are 
observed under the East Pacific Rise, the Cocos plate, and the Pacific 
Ocean basin (at depths of 140 km, 80 km and 250 km, respectively), 
consistent with an isotropic distribution of mantle phases and possibly 
scattered melt pathways towards shallower asthenospheric depths. At 
shallower depths, the decrease in electrical anisotropy and conduc- 
tivity observed for the regions considered (less than about 80 km depth 
off the East Pacific Rise, 40-45 km depth under the Cocos plate, and 
less than about 120 km depth off the Mariana Trench, Fig. 1) may 
involve processes associated with the coupling of tectonic plate motion 
and flow in the underlying mantle. Layers of low electrical anisotropy 
are consistent with a more isotropically distributed melt phase that 
governs the bulk conductivity. An interconnected liquid phase with a 
weak melt preferred orientation in a deformed matrix may also explain 
why some electromagnetic studies in other locations observe high 
conductivities but do not detect electrical anisotropy”’. This melt dis- 
tribution may arise from its accumulation at the lithosphere—astheno- 
sphere boundary and possible upward migration through dikes that 
propagate buoyantly into the lithosphere”. It is also possible that 
asthenospheric melt percolates into the bottom of the lithosphere, 
eventually cooling and crystallizing. If the long-term rheological beha- 
viour of crystallizing melt-rich structures results in increasing the 
viscosity in a zone at the base of the lithosphere, it can then reduce 
the efficiency of viscous coupling between the tectonic plates and the 
underlying flow of the mantle through time. 


Online Content Methods, along with any additional Extended Data display items
and Source Data, are available in the online version of the paper; references unique
to these sections appear only in the online paper.
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METHODS 


Starting materials. Electrical measurements were performed on three anhydrous
starting materials: polycrystalline olivine (Fo90), Fo90 + 5 vol.% MORB, and
Fo90 + 2 vol.% NaKCO3 melt. These samples were deformed in triaxial compression
to a strain of approximately 0.1 or in simple shear to shear strains from 1.2 to 9
(Extended Data Table 1) at a confining pressure of 0.3 GPa and temperatures of 
1,200 °C-1,250 °C in a gas-medium apparatus. The parts of the samples that had 
experienced maximum shear strain were extracted and placed in a conductivity 
cell in the multi-anvil apparatus (Extended Data Fig. 1a) for electrical measure- 
ments. Simultaneous monitoring of the deformation and the electrical response is 
not required to investigate the effect of deformation on the electrical conductivity 
and anisotropy in σ of polycrystalline materials, as long as the same starting
material is used to do the measurements in the different orientations. Note that 
the gas-medium apparatus used to deform the samples is operated at low pressure 
(0.3 GPa), whereas the multi-anvil apparatus can reproduce asthenospheric con- 
ditions (in our case, about 3 GPa). 

Multi-anvil experiments. All electrical experiments were performed at about 3 GPa 
in a multi-anvil apparatus using tungsten carbide cubes with a corner-truncation 
edge length of 8 mm and mullite octahedral pressure media with an edge length of 14 
mm. Graphite heaters were used, placed inside an outer zirconia sleeve that provided 
thermal insulation. Experimental samples were 2 mm in diameter and 0.8-1.5 mm in 
length, and were placed at the centre of the cylindrical heater inside an MgO sleeve. 
Two molybdenum disks (outer diameter 2 mm) were in contact with the sample, 
serving as electrodes. The temperature was monitored with a W95Re5-W74Re26
(C-type) thermocouple inserted within an MgO sleeve with the junction in contact 
with the top of one of the molybdenum disks (Extended Data Fig. 1b). A single 
tungsten-rhenium (W-Re) wire was connected to the other molybdenum disk. 

The conductivity cell (Extended Data Fig. 1b) was connected to a 1260 Solartron 
Impedance/Gain-Phase Analyzer for electrical impedance measurements. Redox 
conditions (oxygen fugacity) were not controlled during the experiments. 
However, the presence of a graphite furnace implies a reducing environment with 
an oxygen fugacity near the fayalite-magnetite—quartz buffer’’. 

To preserve the influence of deformation and chemistry on conductivity, heat- 
ing duration in the multi-anvil conductivity experiments did not exceed 2 h, and 
measurements were preferentially conducted at temperatures <900°C. As illu- 
strated by Extended Data Fig. 2c and f, melt-rich bands were preserved at 
temperatures slightly above the glass transition. This observation is consistent 
with previous experimental work on sheared melt-bearing materials, which 
demonstrated that annealing at about 1,100 °C and below causes essentially no 
melt redistribution over laboratory timescales'’**?. 

Electrical conductivity cell calibration. Our electrical cell has been tested using a
hot-pressed polycrystalline olivine sample (wet Fo75) by performing σ experiments
under the same conditions (same sample, pressure and temperature) in
two different laboratories: Arizona State University (ASU) and Bayerisches Geoinstitut (BGI).
The cell used at BGI was different from the one used at ASU, in terms of both
materials and part dimensions. Electrical conductivity data were collected at 3 GPa
and at temperatures up to 700 °C. All data from both laboratories agreed within a 3%
error in log σ, confirming that our cell was measuring the sample's response
without interference from any other part of the cell assembly. A short-circuit
experiment performed at 4 GPa and at temperatures up to 1,900 °C showed that
our cell has a resistance of about 7 Ω (ref. 33), which is negligible compared to the
resistance of our samples (>3,000 Ω).

Electrical conductivity measurements. The complex impedance was collected 
during cooling over different temperature ranges (Extended Data Tables 3 and 4) 
in the frequency range 1 Hz to 5 MHz. The reproducibility of electrical measure- 
ments was validated by performing a few measurements during heating. For melt- 
free samples and melt-bearing samples at low temperature, the electrical response 
consisted of an impedance arc (Extended Data Fig. 1c). For MORB-bearing sam- 
ples at high temperature, the impedance arc was not observed and the electrical 
response consisted of a noisy part at high frequency and a vertical line at low 
frequency intersecting with the real axis of the complex plane. For all samples, the 
electrical resistance of the sample corresponds to the real part of the complex 
impedance, identified in the real versus imaginary component plot as the intercept 
of the spectrum with the real axis. 

Data reduction and uncertainties. For each sample, the electrical conductivity
was calculated using the measured electrical resistance and sample geometric
factor G (ref. 34)

$$\sigma = \frac{1}{RG} \qquad (1)$$

where G = πr²/L, σ is the electrical conductivity (in units of S m⁻¹), R is the
electrical resistance (in ohms), r is the radius of the cylindrical sample (in metres),
and L is the thickness of the sample (in metres). Relative errors on values of σ
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(Extended Data Tables 3 and 4) were calculated on the basis of errors on the 
geometrical factor (that is, considering errors on r and L) as well as propagated 
errors on each measured value of resistance R. As an example, uncertainties on 
electrical conductivity values are provided in Extended Data Table 5 for each 
temperature of the experiment on sample PT0683-2. 
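A minimal sketch of equation (1) follows, converting a measured resistance into a conductivity through the geometric factor G = πr²/L. The function names are illustrative, and the resistance and sample length are hypothetical placeholders (the length is chosen inside the 0.8-1.5 mm range quoted in the Methods); none of the numbers are measurements from the paper.

```python
import math


def geometric_factor(radius_m, length_m):
    """Geometric factor G = pi * r^2 / L of a cylindrical sample, in metres."""
    return math.pi * radius_m ** 2 / length_m


def conductivity(resistance_ohm, radius_m, length_m):
    """Electrical conductivity sigma = 1 / (R * G), in S/m."""
    return 1.0 / (resistance_ohm * geometric_factor(radius_m, length_m))


# Hypothetical example: 2 mm diameter, 1 mm long sample with R = 3,000 ohm.
radius = 1.0e-3      # metres
length = 1.0e-3      # metres
resistance = 3000.0  # ohms (placeholder)

sigma = conductivity(resistance, radius, length)
print(f"G = {geometric_factor(radius, length):.3e} m, sigma = {sigma:.3e} S/m")
```

Because G scales with r² and with 1/L, an error on the radius affects the result more strongly than the same relative error on the length, which is why the geometry enters the quoted uncertainties.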

Analytical techniques. All the recovered run products were mounted in epoxy resin 
and then ground and polished (using ethanol for the carbonate-bearing samples to 
preserve the carbonate phase, and using water for all the other samples) for analytical 
investigations. Scanning electron microscope imaging and electron microprobe ana- 
lyses were performed after each experiment to characterize the texture and chemistry 
of the samples. The chemical compositions of melt and mineral phases were obtained 
using a JEOL JXA-8900 Electron Probe Microanalyzer equipped with five wave- 
length-dispersive spectrometers. Electron microprobe analyses were conducted 
at 15 keV and 10 nA, with 10 s counting on peak elements and a beam size of 
100 µm × 100 µm.

Electrical conductivity results. The dependence of electrical conductivity on
temperature for each sample was fitted using an Arrhenius law

$$\sigma = \sigma_0 \exp\left[-\frac{E_a}{kT}\right]$$

with σ_0 being the pre-exponential factor (in units of S m⁻¹), E_a the activation
energy (in electronvolts), k the Boltzmann constant (8.617 × 10⁻⁵ eV K⁻¹), and T
the temperature (in kelvin). Values calculated for the pre-exponential factor and
activation energy are presented in Extended Data Tables 3 and 4.
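One common way to obtain the two Arrhenius parameters is a straight-line fit of ln σ against 1/T, sketched below with the standard library only. The fitting approach is an assumption rather than the authors' documented procedure, and the data are synthetic values generated from hypothetical parameters (σ_0 = 100 S m⁻¹, E_a = 1.0 eV), not results from the paper.

```python
import math

K_B = 8.617e-5  # Boltzmann constant in eV/K, as quoted in the text


def arrhenius(sigma0, ea_ev, temp_k):
    """Conductivity (S/m) from sigma = sigma0 * exp(-Ea / (k * T))."""
    return sigma0 * math.exp(-ea_ev / (K_B * temp_k))


def fit_arrhenius(temps_k, sigmas):
    """Least-squares line through (1/T, ln sigma); returns (sigma0, Ea in eV)."""
    x = [1.0 / t for t in temps_k]
    y = [math.log(s) for s in sigmas]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # equals -Ea / k
    intercept = y_mean - slope * x_mean    # equals ln(sigma0)
    return math.exp(intercept), -slope * K_B


# Synthetic, noise-free data from hypothetical parameters.
temps = [973.0, 1073.0, 1173.0, 1273.0, 1373.0, 1573.0]
data = [arrhenius(100.0, 1.0, t) for t in temps]

sigma0_fit, ea_fit = fit_arrhenius(temps, data)
print(f"sigma0 = {sigma0_fit:.3f} S/m, Ea = {ea_fit:.3f} eV")

# Extrapolation to 1,350 degrees Celsius (1623.15 K) with the fitted parameters.
print(f"sigma(1350 C) = {arrhenius(sigma0_fit, ea_fit, 1623.15):.3e} S/m")
```

Where the text describes two Arrhenius segments on either side of a bend, the same function could be applied separately to the data below and above that temperature.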

Electrical conductivity layered model. In this electrical model, the material
considered consists of alternating layers of sheared polycrystalline olivine and of olivine + 5 vol%
MORB (layered model 1) or of basalt only (layered model 2). Electrical anisotropy
can be modelled using series and parallel circuits, depending on the horizontal
direction (x and y) of the electrical current. The vertical z direction is not considered
since the magnetotelluric technique does not detect vertical anisotropy.

Series circuit. The direction of the electrical current is perpendicular to the shear
plane. The equivalent resistance of the circuit, R_eq, corresponds to the sum of the
resistances of each layer, R_i

$$R_{\mathrm{eq}} = \sum_i R_i \qquad (2)$$

The resistance R_i is calculated from values for σ from our study (sheared samples of
polycrystalline olivine and of olivine + MORB) and from ref. 22 (basalt) using the
relationship

$$R_i = \frac{1}{\sigma_i \times G_i} \qquad (3)$$

with G_i the geometric factor of layer i (surface area divided by thickness). The bulk
equivalent electrical conductivity is obtained using

$$\sigma_{\mathrm{eq}} = \frac{1}{R_{\mathrm{eq}} \times G} \qquad (4)$$

where G is the geometric factor for the sample.

Application. We consider a cubic and layered material of size L = 1.5 mm (G =
1.5 × 10⁻³ m) made of five layers, two of olivine and three of olivine + MORB or
of basalt. The dimensions of each olivine layer are 1.5 mm × 1.5 mm × 0.525 mm
(geometric factor of 4.3 × 10⁻³ m), and the dimensions of each melt-bearing layer
are 1.5 mm × 1.5 mm × 0.15 mm (geometric factor of 1.5 × 10⁻² m). At 1,200 °C,
the equivalent value of σ is 1.31 × 10⁻³ S m⁻¹ (log σ = −2.88) for the material
with basalt layers and 2.99 × 10⁻⁴ S m⁻¹ (log σ = −3.52) for the material with
5 vol% MORB layers. At 1,300 °C, the equivalent value of σ is 2.13 × 10⁻³ S m⁻¹
(log σ = −2.67) for the material with basalt layers and 1.08 × 10⁻³ S m⁻¹
(log σ = −2.97) for the material with 5 vol% MORB layers.

Parallel circuit. In this case, the electrical current is parallel to the shear plane. The 
reciprocal of equivalent resistance equals the sum of the reciprocals of the resist- 
ance values 


$$\frac{1}{R_{eq}} = \sum_i \frac{1}{R_i} \qquad (5)$$

The resistance Ri is calculated from values of σ measured in our study (undeformed
polycrystalline olivine and olivine + MORB) and reported in ref. 22 (for basalt).
The bulk equivalent electrical conductivity is obtained using the relationship

$$\sigma_{eq} = \frac{1}{R_{eq} \times G} \qquad (6)$$


Application. We consider the same cubic material as above. With this direction
for the current flow, the geometric factor of each olivine layer is 5.25 × 10⁻⁴ m
and of each melt-bearing layer is 1.5 × 10⁻⁴ m. At 1,200 °C, the equivalent
value of σ is 0.11 S m⁻¹ (log σ = −0.96) for the material with basalt layers and
6.59 × 10⁻³ S m⁻¹ (log σ = −2.18) for the material with 5 vol% MORB layers.
At 1,300 °C, the equivalent value of σ is 0.44 S m⁻¹ (log σ = −0.36) for the
material with basalt layers and 1.40 × 10⁻² S m⁻¹ (log σ = −1.85) for the material
with 5 vol% MORB layers.

©2015 Macmillan Publishers Limited. All rights reserved

Electrical anisotropy. Electrical anisotropy is the ratio of electrical conductivity
for the parallel circuits to electrical conductivity for the series circuits (that is,
σ(parallel circuit)/σ(series circuit)). For the material containing 5 vol% MORB layers,
electrical anisotropy is 22.0 (±2.0) and 13.0 (±0.1) at 1,200 °C and 1,300 °C,
respectively. For the material containing basalt layers, higher values for electrical
anisotropy of 82.9 (±7.5) and 207 (±29) at 1,200 °C and 1,300 °C, respectively,
are obtained.
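
The layered model can be sketched in a few lines of code. The example below is a minimal illustration, not the calculation used for the reported values: the function names are hypothetical, the two layer conductivities are placeholder numbers, and the same conductivities are used for both current directions, whereas the text uses sheared-sample values for the series case and undeformed-sample values for the parallel case. The geometry (a 1.5 mm cube with two 0.525 mm olivine layers and three 0.15 mm melt-bearing layers) follows the Application paragraphs.

```python
import math


def geometric_factor(area_m2, length_m):
    """Cross-sectional area divided by the length along the current path (m)."""
    return area_m2 / length_m


def series_conductivity(sigmas, thicknesses, side):
    """Bulk conductivity for current perpendicular to the layers (series circuit)."""
    area = side * side
    r_eq = sum(1.0 / (s * geometric_factor(area, t))
               for s, t in zip(sigmas, thicknesses))
    return 1.0 / (r_eq * geometric_factor(area, sum(thicknesses)))


def parallel_conductivity(sigmas, thicknesses, side):
    """Bulk conductivity for current parallel to the layers (parallel circuit)."""
    inv_r_eq = sum(s * geometric_factor(side * t, side)
                   for s, t in zip(sigmas, thicknesses))
    return inv_r_eq / geometric_factor(side * sum(thicknesses), side)


SIDE = 1.5e-3  # cube edge length in metres
THICKNESSES = [0.15e-3, 0.525e-3, 0.15e-3, 0.525e-3, 0.15e-3]
SIGMA_OLIVINE = 1.0e-4     # hypothetical placeholder, S/m
SIGMA_MELT_LAYER = 1.0e-1  # hypothetical placeholder, S/m
SIGMAS = [SIGMA_MELT_LAYER, SIGMA_OLIVINE, SIGMA_MELT_LAYER,
          SIGMA_OLIVINE, SIGMA_MELT_LAYER]

sigma_series = series_conductivity(SIGMAS, THICKNESSES, SIDE)
sigma_parallel = parallel_conductivity(SIGMAS, THICKNESSES, SIDE)
print("log10 sigma (series):  ", math.log10(sigma_series))
print("log10 sigma (parallel):", math.log10(sigma_parallel))
print("anisotropy:", sigma_parallel / sigma_series)
```

For a uniform material both functions return the layer conductivity itself, which is a convenient sanity check; for any layered material the parallel value is at least as large as the series value.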

Extended Data Figure 1 | Experimental protocol. a, Starting material
preparation. Deformed material is synthesized in a gas-medium apparatus at 
300 MPa (left), and its outer part is extracted for electrical measurements in the 
multi-anvil apparatus at about 3 GPa (right). Two orientations of the sample 
are considered, leading to vertical (blue) and tangential (purple) electrical 
measurements. b, Cross-section of the electrical conductivity cell (14/8 multi-anvil
assembly, that is, the corner-truncation edge length is 8 mm and the
pressure media edge length is 14 mm). Both electrodes are made of W-Re
thermocouple wire, with one electrode also serving as a thermocouple.
c, Example of a complex impedance spectrum (real part Z’ versus imaginary
part Z’’) for a sheared sample of olivine + 5 vol% MORB at 750 °C and
approximately 3 GPa. The intersection between the response of the sample 
(blue dots, each dot corresponding to one frequency) with the real axis 
corresponds to the electrical resistance of the sample. The corresponding 
electrical conductivity value is obtained using the geometric factor (the surface 
of the electrode divided by thickness of the sample). 

Extended Data Figure 2 | Texture and melt geometry. a, Back-scattered
electron image of sheared sample PT0683-2 after electrical measurements, 
showing that deformation-induced melt texture was preserved during the 
experiment in the multi-anvil apparatus. b, Back-scattered electron image of 
sheared sample PT0756-2 after electrical measurements, illustrating the 
location of melt amongst the olivine grains. c, Starting material PT0683 
showing the presence of melt-rich bands. d, Back-scattered electron image of 
sheared sample PT0683-1HT after electrical measurements at high 
temperature (up to 1,573 K). The absence of pronounced melt-rich bands 
suggests a loss in structural anisotropy, attributed to the effect of high 
temperature. These observations are consistent with electrical data that showed
a noticeable decrease in electrical anisotropy with increasing temperature.
e, Back-scattered electron image of sheared sample PT0683-1LT after electrical 
measurements at low temperature (<1,073 K), showing several light-coloured 
zones identified as melt-rich bands. The rectangle corresponds to the location 
of image f below. f, Zoom on melt-rich bands in sample PT0683-1LT. Melt is 
also present between the melt-rich zones as pockets amongst the olivine grains. 
g, Map of sodium distribution in sample PT0742-2 after electrical 
measurements. Colours correspond to the number of counts. Warm colours 
(reddish) correspond to high sodium concentration and are interpreted as 
pockets of carbonatite melt between olivine grains. 

Extended Data Table 1 | Description of the starting materials.


Sample* Composition Av. grain size Starting material synthesis 
(micron) 
Deformation P T Stress Shear Shear stress Shear strain rate 
type (GPa) (K) (GPa) strain (GPa) (10% s") 

Dry polycrystalline olivine
PI1543-1 Fo90 16 compression 0.3 1523 0.44 - - -
PI1543-2 Fo90 16 compression 0.3 1523 0.44 - - -
PT0264-1 Fo90 4-10 torsion 0.3 1473 - 35 0.24 3.5
PT0264-2 Fo90 4-10 torsion 0.3 1473 - 35 0.24 3.5
Dry polycrystalline olivine + MORB
PT0683-1LT Fo90+5% MORB 15 torsion 0.3 1473 - 12 0.12-0.15 46
PT0683-1HT Fo90+5% MORB 15 torsion 0.3 1473 - 12 0.12-0.15 46
PT0683-2 Fo90+5% MORB 15 torsion 0.3 1473 - 12 0.12-0.15 46
PT0705-2 Fo90+5% MORB 15 torsion 0.3 1473 - 9 0.12-0.15 15
PT0756-1 Fo90+5% MORB 15 torsion 0.3 1473 - 2.16 0.12-0.15 55
PT0756-2 Fo90+5% MORB 15 torsion 0.3 1473 - 2.16 0.12-0.15 55
Dry polycrystalline olivine + carbonate melt
PT0742-1 Fo90+2% NaKCO3 melt 5-10 torsion 0.3 1473 - 3 0.12-0.15 1.1
PT0742-2 Fo90+2% NaKCO3 melt 5-10 torsion 0.3 1473 - 3 0.12-0.15 1.1
PT0742-3 Fo90+2% NaKCO3 melt 5-10 torsion 0.3 1473 - 3 0.12-0.15 1.1


* Extension: electrical conductivity (σ) measurements perpendicular to deformation (-1), parallel to deformation (-2) and undeformed (centre of column) (-3), respectively. LT, low temperature; HT, high temperature.

Extended Data Table 2 | Electron microprobe analyses of most of the run products


Sample Phase | SiO2 TiO2 Al2O3 Cr2O3 FeOtot MnO MgO CaO Na2O K2O NiO | Mg# | Sum
PT0683-1LT ol(9) | 40.7 (16) - - - 10.4 (1) 0.11 (2) 48.8 (32) - - - 0.35 (3) | 0.89 | 100.5
PT0683-1LT gl(1) | 46.7 0.29 13.2 0.04 5.78 0.05 22.9 5.21 2.64 0.06 - | 0.88 | 96.9
PT0683-2 gl(1) | 47.0 0.22 16.0 0.01 5.23 0.05 22.9 6.26 2.24 0.12 - | 0.89 | 99.9
PT0683-2bis ol(17) | 40.9 (48) - - - 9.67 di) 0.12 (2) 48.8 (31) - - - 0.37 (3) | 0.90 | 99.9
PT0683-2bis gl(1) | 47.4 1.62 12.6 0.02 6.82 0.11 21.4 7A 2.15 0.15 - | 0.85 | 99.3
PT0705-2 ol(17) | 40.9 (79) - - - 9.25 (14) 0.12 (3) 48.4 (23) - - - 0.37 (4) | 0.90 | 99.1
PT0705-2 gl(1) | 45.7 0.34 13.4 0.02 5.03 0.17 26.7 5.42 1.85 0.07 - | 0.90 | 98.7
PT0756-1 ol(12) | 40.9 (16) - - - 9.64 (13) 0.12 (2) 49.4 (29) - - - 0.35 (4) | 0.90 | 100.3
PT0756-2 ol(10) | 41.0 (29) - - - 10.0 (19) 0.14 (2) 48.0 (34) - - - 0.37 (2) | 0.90 | 99.5
PT0756-2 gl(3) | 47.3 (9) 0.14 (10) 17.0 (24) 0.04 (1) 5.08 (17) 0.05 (1) 21.1 (68) 6.82 (24) 2.37 (19) 0.05 (1) - | 0.88 | 100.0
PT0742-1 ol(29) | 41.0 (13) - - - 8.44 (47) 0.12 (3) 50.5 (36) - - - 0.36 (4) | 0.91 | 100.3
PT0742-3 ol(10) | 41.4 (30) - - - 8.48 (63) 0.11 (2) 50.1 (52) - - - 0.31 (8) | 0.91 | 100.4


Contents are in weight per cent. Ol, olivine; gl, glass. A dash indicates ‘not measured’. The number of microprobe analyses is shown in parentheses after the phase name. One standard deviation in terms of least units cited is shown in parentheses after each value; for example, 46.11(123) indicates a standard deviation of 1.23 wt% on the value of 46.11 wt%.

Extended Data Table 3 | Electrical results for polycrystalline olivine materials

Sample Composition | Electrical conductivity experiments | Arrhenius equation parameters†
Sample Composition | P (GPa) T (K) Duration (hr)* Rel. error on σ (%) | Ea (eV) Error (eV) ln(σ0) (σ0 in S/m) Error (S/m)
PI1543-1 Fo90 | 2.8 1121-1575 2.0 7.5-10.5 | 0.959 0.005 0.56 0.003
PI1543-2 Fo90 | 2.8 1274-1623 19 2.1-7.1 | 0.879 0.010 0.64 0.007
PT0264-1 Fo90 | 2.8 873 0.5 2.2 | - - - -
PT0264-2 Fo90 | 2.8 1173-1570 1.0 2.0-11.5 | 1.042 0.001 4.1 0.003

*Time spent in heating and cooling cycles.
†ln σ = ln σ0 − Ea/kT.

Extended Data Table 4 | Electrical results for melt-bearing materials

Sample Composition | Electrical conductivity experiments | Arrhenius equation parameters at T<~870°C† | Arrhenius equation parameters at T>~870°C† | Extrapolation of low-T data to T>Tg‡
Sample Composition | P (GPa) T (K) Duration (hr)* Rel. error on σ (%) | Ea (eV) Error (eV) ln(σ0) (σ0 in S/m) Error (S/m) | Ea (eV) Error (eV) ln(σ0) (σ0 in S/m) Error (S/m) | Slope a Intercept b R²
Dry polycrystalline olivine + MORB
PT0683-1LT Fo90+5% MORB | 2.8 847-1,073 1.0 0.8-1.6 | 1.047 0.030 3.5 0.100 | - - - - | -0.571 1.942 0.967
PT0683-1HT Fo90+5% MORB | 2.8 1,395-1,466 1 4.1-5.1 | - - - - | 1.223 0.047 6.0 0.667 | - - -
PT0683-2 Fo90+5% MORB | 2.8 825-1,479 13 2.5-10.7 | 0.335 0.003 -3.6 0.028 | 0.691 0.010 0.14 0.002 | - - -
PT0705-2 Fo90+5% MORB | 2.8 822-1,076 1.0 1.6-3.1 | 1.109 0.010 5.1 0.480 | - - - - | -0.542 2.023 0.984
PT0756-1 Fo90+10% MORB | 2.8 725-1,120 2.5 5.0-5.7 | 0.942 0.004 1.3 0.005 | - - - - | -0.499 0.775 0.995
PT0756-2 Fo90+10% MORB | 2.8 671-1,078 12 6.4-6.8 | 0.728 0.022 0.55 0.117 | - - - - | -0.178 1.523 0.904
Dry polycrystalline olivine + carbonate melt
PT0742-1 Fo90+2% NaKCO3 melt | 2.8 1,023-1,473 2.0 1.6-4.2 | 0.381 0.018 -3.3 0.160 | 1.254 0.010 5.3 0.043 | - - -
PT0742-3 Fo90+2% NaKCO3 melt | 2.8 968-1,576 1.6 1.6-4.2 | 0.509 0.032 -3.1 0.196 | 1.358 0.013 5.6 0.053 | - - -

*Time spent in heating and cooling cycles.
†ln σ = ln σ0 − Ea/kT.
‡log σ = a × 10,000/T + b.

Extended Data Table 5 | Uncertainties on electrical conductivity values for the experiment on sample PT0683-2

Temperature (K) | Resistance (ohm) | Electrical conductivity (S/m) | Error on conductivity (%) | Upper value (S/m) | Lower value (S/m)

825 1800000 0.00027 2.5 0.00028 0.00026 
848 1680000 0.00029 2.5 0.00030 0.00028 
878 1500000 0.00032 2.6 0.00034 0.00031 
899 1320000 0.00037 2.8 0.00039 0.00035 
922 1200000 0.00041 2.9 0.00043 0.00038 
945 1100000 0.00044 3.0 0.00047 0.00042 
975 1000000 0.00049 3.1 0.00052 0.00046 
1001 860000 0.00057 3.4 0.00060 0.00053 
1024 780000 0.00062 3.6 0.00067 0.00058 
1050 700000 0.00070 3.8 0.00075 0.00064 
1078 640000 0.00076 4.0 0.00082 0.00070 
1098 590000 0.00083 3.3 0.00088 0.00077 
1126 520000 0.00094 3.6 0.00100 0.00087 
1144 480000 0.00101 3.7 0.00109 0.00094 
1176 420000 0.00116 4.0 0.00125 0.00107 
1196 360000 0.00135 4.4 0.00147 0.00123
1221 294000 0.00166 5.0 0.00182 0.00149 
1245 260000 0.00187 5.5 0.00208 0.00167 
1277 220000 0.00221 6.2 0.00249 0.00194 
1321 172000 0.00283 7.5 0.00325 0.00241
1378 136000 0.00358 9.0 0.00423 0.00294 
1422 116000 0.00420 10.3 0.00506 0.00334 
1479 110000 0.00443 10.7 0.00538 0.00348 
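
To show how Arrhenius parameters can be recovered from data such as those in Extended Data Table 5, the sketch below fits ln σ against 1/T by ordinary least squares, using only the rows measured below about 870 °C (1,143 K). This is an illustrative unweighted fit with a hypothetical function name, not necessarily the fitting procedure used for the tables; it should nevertheless land close to the Ea ≈ 0.335 eV and ln σ0 ≈ −3.6 listed for sample PT0683-2 in Extended Data Table 4.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K, as quoted in the text


def fit_arrhenius(temperatures_k, conductivities):
    """Least-squares fit of ln(sigma) = ln(sigma0) - Ea/(k*T).

    Returns (ln_sigma0, e_a) with e_a in electronvolts.
    """
    xs = [1.0 / t for t in temperatures_k]
    ys = [math.log(s) for s in conductivities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # equals -Ea/k
    ln_sigma0 = mean_y - slope * mean_x
    return ln_sigma0, -slope * K_BOLTZMANN_EV


# Low-temperature rows of Extended Data Table 5 (T in K, sigma in S/m).
LOW_T = [825, 848, 878, 899, 922, 945, 975, 1001, 1024, 1050, 1078, 1098, 1126]
LOW_SIGMA = [0.00027, 0.00029, 0.00032, 0.00037, 0.00041, 0.00044, 0.00049,
             0.00057, 0.00062, 0.00070, 0.00076, 0.00083, 0.00094]

ln_sigma0, e_a = fit_arrhenius(LOW_T, LOW_SIGMA)
print(f"ln(sigma0) = {ln_sigma0:.2f}, Ea = {e_a:.3f} eV")
```

Because the tabulated conductivities are rounded to two or three significant figures, small differences from the published parameters are expected.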

We generated genome-wide data from 69 Europeans who lived
between 8,000-3,000 years ago by enriching ancient DNA libraries 
for a target set of almost 400,000 polymorphisms. Enrichment of 
these positions decreases the sequencing required for genome-wide 
ancient DNA analysis by a median of around 250-fold, allowing us 
to study an order of magnitude more individuals than previous 
studies’ * and to obtain new insights about the past. We show that 
the populations of Western and Far Eastern Europe followed opposite 
trajectories between 8,000-5,000 years ago. At the beginning of the 
Neolithic period in Europe, ~8,000-7,000 years ago, closely related 
groups of early farmers appeared in Germany, Hungary and Spain, 
different from indigenous hunter-gatherers, whereas Russia was inhab- 
ited by a distinctive population of hunter-gatherers with high affinity 
to a ~24,000-year-old Siberian®. By ~6,000-5,000 years ago, farmers 
throughout much of Europe had more hunter-gatherer ancestry than 
their predecessors, but in Russia, the Yamnaya steppe herders of this 
time were descended not only from the preceding eastern European 
hunter-gatherers, but also from a population of Near Eastern ances- 
try. Western and Eastern Europe came into contact ~4,500 years ago, 
as the Late Neolithic Corded Ware people from Germany traced 
~75% of their ancestry to the Yamnaya, documenting a massive 
migration into the heartland of Europe from its eastern periphery. 
This steppe ancestry persisted in all sampled central Europeans until 
at least ~3,000 years ago, and is ubiquitous in present-day Europeans. 
These results provide support for a steppe origin’ of at least some of 
the Indo-European languages of Europe. 

Genome-wide analysis of ancient DNA has emerged as a transform- 
ative technology for studying prehistory, providing information that is 
comparable in power to archaeology and linguistics. Realizing its pro- 
mise, however, requires collecting genome-wide data from an adequate 
number of individuals to characterize population changes over time, 
which means not only sampling a succession of archaeological cultures’, 
but also multiple individuals per culture. To make analysis of large numbers
of ancient DNA samples practical, we used in-solution hybridization
capture’®”’ to enrich next generation sequencing libraries for a
target set of 394,577 single nucleotide polymorphisms (SNPs) (‘390k
capture’), 354,212 of which are autosomal SNPs that have also been
genotyped using the Affymetrix Human Origins array in 2,345 humans 
from 203 populations*’’. This reduces the amount of sequencing re- 
quired to obtain genome-wide data by a minimum of 45-fold and a 
median of 262-fold (Supplementary Data 1). This strategy allows us to 
report genomic scale data on more than twice the number of ancient 
Eurasians as has been presented in the entire preceding literature’* 
(Extended Data Table 1). 

We used this technology to study population transformations in Europe. 
We began by preparing 212 DNA libraries from 119 ancient samples in 
dedicated clean rooms, and testing these by light shotgun sequencing 
and mitochondrial genome capture (Supplementary Information sec- 
tion 1, Supplementary Data 1). We restricted the analysis to libraries 
with molecular signatures of authentic ancient DNA (elevated damage 
in the terminal nucleotide), negligible evidence of contamination based 
on mismatches to the mitochondrial consensus** and, where available, 
a mitochondrial DNA haplogroup that matched previous results using 
PCR*!*"> (Supplementary Information section 2). For 123 libraries 
prepared in the presence of uracil-DNA-glycosylase’® to reduce errors 
due to ancient DNA damage”, we performed 390k capture, carried out 
paired-end sequencing and mapped the data to the human genome. 
We restricted analysis to 94 libraries from 69 samples that had at least 
0.06-fold average target coverage (average of 3.8-fold) and used major- 
ity rule to call an allele at each SNP covered at least once (Supplemen- 
tary Data 1). After combining our data (Supplementary Information 
section 3) with 25 ancient samples from the literature — three Upper 
Paleolithic samples from Russia’®’, seven people of European hunter- 
gatherer ancestry~*>*, and fifteen European farmers**** — we had data 
from 94 ancient Europeans. Geographically, these came from Germany 
(n = 41), Spain (n = 10), Russia (n = 14), Sweden (n = 12), Hungary 
(n = 15), Italy (n = 1) and Luxembourg (n = 1) (Extended Data Table 2). 

¹Australian Centre for Ancient DNA, School of Earth and Environmental Sciences & Environment Institute, University of Adelaide, Adelaide, South Australia 5005, Australia. ²Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. ³Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA. ⁴Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts 02115, USA. ⁵Institute of Anthropology, Johannes Gutenberg University of Mainz, D-55128 Mainz, Germany. ⁶Max Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany. ⁷Key Laboratory of Vertebrate Evolution and Human Origins of Chinese Academy of Sciences, IVPP, CAS, Beijing 100049, China. ⁸Institute for Archaeological Sciences, University of Tübingen, D-72070 Tübingen, Germany. ⁹Institute of Archaeology, Research Centre for the Humanities, Hungarian Academy of Science, H-1014 Budapest, Hungary. ¹⁰Römisch Germanische Kommission (RGK) Frankfurt, D-60325 Frankfurt, Germany. ¹¹Archaeological Research Laboratory, Stockholm University, 114 18 Stockholm, Sweden. ¹²Departments of Paleoanthropology and Archaeogenetics, Senckenberg Center for Human Evolution and Paleoenvironment, University of Tübingen, D-72070 Tübingen, Germany. ¹³State Office for Heritage Management and Archaeology Saxony-Anhalt and State Museum of Prehistory, D-06114 Halle, Germany. ¹⁴Departamento de Prehistoria y Arqueología, Facultad de Filosofía y Letras, Universidad Autónoma de Madrid, E-28049 Madrid, Spain. ¹⁵The Cultural Heritage Foundation, Västerås 722 12, Sweden. ¹⁶Peter the Great Museum of Anthropology and Ethnography (Kunstkamera) RAS, St Petersburg 199034, Russia. ¹⁷Volga State Academy of Social Sciences and Humanities, Samara 443099, Russia. ¹⁸Deutsches Archaeologisches Institut, Abteilung Madrid, E-28002 Madrid, Spain. ¹⁹Danube Private University, A-3500 Krems, Austria. ²⁰Institute for Prehistory and Archaeological Science, University of Basel, CH-4003 Basel, Switzerland. ²¹Departamento de Prehistoria, Universitat Autònoma de Barcelona, E-08193 Barcelona, Spain. ²²Departamento de Prehistoria y Arqueología, Universidad de Valladolid, E-47002 Valladolid, Spain. ²³State Office for Cultural Heritage Management Baden-Württemberg, Osteology, D-78467 Konstanz, Germany. ²⁴Max Planck Institute for the Science of Human History, D-07745 Jena, Germany. ²⁵Anthropology Department, Hartwick College, Oneonta, New York 13820, USA.
*These authors contributed equally to this work.

11 JUNE 2015 | VOL 522 | NATURE | 207

Following the central European chronology, these included 19 hunter-gatherers
(~43,000-2,600 BC), 28 Early Neolithic farmers (~6,000-4,000 BC),
11 Middle Neolithic farmers (~4,000-3,000 BC) including
the Tyrolean Iceman’, 9 Late Copper/Early Bronze Age individuals
(Yamnaya: ~3,300-2,700 BC), 15 Late Neolithic individuals (~2,500-2,200 BC),
9 Early Bronze Age individuals (~2,200-1,500 BC), two Late
Bronze Age individuals (~1,200-1,100 BC) and one Iron Age individual
(~900 BC). Two individuals were excluded from analyses as they
were related to others from the same population. The average number of
SNPs covered at least once was 212,375 and the minimum was 22,869
(Fig. 1).
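
The majority-rule allele calling mentioned above, for SNPs covered at least once, can be illustrated with a short sketch. The function name `call_allele` is hypothetical, and the handling of ties (returning no call) is an assumption made here for illustration; the text does not state how ties were resolved.

```python
from collections import Counter


def call_allele(observed_bases):
    """Return the most frequently observed base at one SNP.

    Returns None when the SNP is not covered or when the two most common
    bases are tied (the tie rule is an assumption of this sketch).
    """
    counts = Counter(observed_bases)
    if not counts:
        return None
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Bases observed in reads overlapping three illustrative SNP positions.
for reads in ("AAG", "C", "AG"):
    print(reads, "->", call_allele(reads))
```

At low coverage most SNPs are seen in a single read, in which case the call is simply the base carried by that read.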

We determined that 34 of the 69 newly analysed individuals were 
male and used 2,258 Y chromosome SNP targets included in the capture
to obtain high resolution Y chromosome haplogroup calls (Supplementary
Information section 4). Outside Russia, and before the Late
Neolithic period, only a single R1b individual was found (early Neolithic 
Spain) in the combined literature (n = 70). By contrast, haplogroups 
R1a and R1b were found in 60% of Late Neolithic/Bronze Age Europeans
outside Russia (n = 10), and in 100% of the samples from European
Russia from all periods (7,500-2,700 BC; n = 9). R1a and R1b are the
most common haplogroups in many European populations today'*””, 
and our results suggest that they spread into Europe from the East after 
3,000 BC. Two hunter-gatherers from Russia included in our study belonged
to R1a (Karelia) and R1b (Samara), the earliest documented ancient
samples of either haplogroup discovered to date. These two hunter-gatherers
did not belong to the derived lineages M417 within R1a and
M269 within R1b that are predominant in Europeans today’*”’, but all 
7 Yamnaya males did belong to the M269 subclade'* of haplogroup R1b.

Principal components analysis (PCA) of all ancient individuals along 
with 777 present-day West Eurasians* (Fig. 2a, Supplementary Infor- 
mation section 5) replicates the positioning of present-day Europeans 
between the Near East and European hunter-gatherers*”, and the clus- 
tering of early farmers from across Europe with present day Sardinians**, 
suggesting that farming expansions across the Mediterranean to Spain 
and via the Danubian route to Hungary and Germany descended from 
a common stock. By adding samples from later periods and additional 
locations, we also observe several new patterns. All samples from Russia 
have affinity to the ~24,000-year-old MA1 (ref. 6), the type specimen for
the Ancient North Eurasians (ANE) who contributed to both Europeans*
and Native Americans*®*. The two hunter-gatherers from Russia (Karelia 
in the northwest of the country and Samara on the steppe near the Urals) 
form an ‘eastern European hunter-gatherer’ (EHG) cluster at one end of 
a hunter-gatherer cline across Europe; people of hunter-gatherer ances- 
try from Luxembourg, Spain, and Hungary sit at the opposite ‘western 
European hunter-gatherer’* (WHG) end, while the hunter-gatherers 
from Sweden** (SHG) are intermediate. Against this background of dif- 
ferentiated European hunter-gatherers and homogeneous early farmers, 
multiple population turnovers transpired in all parts of Europe included 
in our study. Middle Neolithic Europeans from Germany, Spain, Hungary, 
and Sweden from the period ~4,000-3,000 BC are intermediate between
the earlier farmers and the WHG, suggesting an increase of WHG ances- 
try throughout much of Europe. By contrast, in Russia, the later Yamnaya 
steppe herders of ~3,000 BC plot between the EHG and the present-day
Near East/Caucasus, suggesting a decrease of EHG ancestry during the 
same time period. The Late Neolithic and Bronze Age samples from 
Germany and Hungary’ are distinct from the preceding Middle Neo- 
lithic and plot between them and the Yamnaya. This pattern is also 
seen in ADMIXTURE analysis (Fig. 2b, Supplementary Information 
section 6), which implies that the Yamnaya have ancestry from popu- 
lations related to the Caucasus and South Asia that is largely absent in 
38 Early or Middle Neolithic farmers but present in all 25 Late Neo- 
lithic or Bronze Age individuals. This ancestry appears in Central 
Europe for the first time in our series with the Corded Ware around 
2,500 BC (Supplementary Information section 6, Fig. 2b). The Corded
Ware shared elements of material culture with steppe groups such as 
the Yamnaya although whether this reflects movements of people has 
been contentious”’. Our genetic data provide direct evidence of migra- 
tion and suggest that it was relatively sudden. The Corded Ware are 
genetically closest to the Yamnaya ~2,600 km away, as inferred both 
from PCA and ADMIXTURE (Fig. 2) and F_ST (0.011 ± 0.002) (Extended
Data Table 3). If continuous gene flow from the east, rather than migra- 
tion, had occurred, we would expect successive cultures in Europe 
to become increasingly differentiated from the Middle Neolithic, but 
instead, the Corded Ware are both the earliest and most strongly differentiated
from the Middle Neolithic population.

Figure 1 | Location and SNP coverage of samples included in this study.
a, Geographic location and time-scale (central European chronology) of the 69
newly analysed ancient individuals from this study (black outline) and 25 from
the literature for which shotgun sequencing data was available (no outline).
b, Number of SNPs covered at least once in the analysis data set of 94
individuals.

[Figure 2a: PCA plot, horizontal axis ‘Dimension 1’; labelled groups: Ancient North Eurasians (ANE), Eastern European hunter-gatherers (EHG), Scandinavian hunter-gatherers (SHG), Western European hunter-gatherers (WHG), Early Neolithic (EN), Middle Neolithic (MN), Late Neolithic/Bronze Age (LN/BA), Yamnaya and Corded Ware; annotations: ‘WHG replaced by early European farmers, >5,500 BC’, ‘Resurgence of WHG, ~5,000-3,000 BC’ and ‘Dilution of EHG, ~5,000-3,000 BC’. Figure 2b: ADMIXTURE plot by population.]

Figure 2 | Population transformations in Europe. a, PCA analysis. b, ADMIXTURE analysis. The full ADMIXTURE analysis including present-day humans is
shown in Supplementary Information section 6.

‘Outgroup’ f₃ statistics⁶ (Supplementary Information section 7), which
measure shared genetic drift between a pair of populations (Extended 
Data Fig. 1), support the clustering of hunter-gatherers, Early/Middle 
Neolithic, and Late Neolithic/Bronze Age populations into different 
groups as in the PCA (Fig. 2a). We also analysed f₄ statistics, which allow
us to test whether pairs of populations are consistent with descent from 
common ancestral populations, and to assess significance using a nor- 
mally distributed Z score. Early European farmers from the Early and 
Middle Neolithic were closely related but not identical. This is reflected 
in the fact that Loschbour, a WHG individual from Luxembourg* shared 
more alleles with post-4,000 Bc European farmers from Germany, Spain, 
Hungary, Sweden and Italy than with early farmers of Germany, Spain, 
and Hungary, documenting an increase of hunter-gatherer ancestry in 
multiple regions of Europe during the course of the Neolithic. The two 
EHG form a clade with respect to all other present-day and ancient popu- 
lations (|Z| < 1.9), and MA1 shares more alleles with them (|Z| > 4.7) 
than with other ancient or modern populations, suggesting that they 
may be a source for the ANE ancestry in present Europeans*’*”* as they
are geographically and temporally more proximate than Upper Paleolithic 
Siberians. The Yamnaya differ from the EHG by sharing fewer alleles 
with MA1 (|Z| = 6.7) suggesting a dilution of ANE ancestry between 
5,000-3,000 BC on the European steppe. This was likely due to admixture
of EHG with a population related to present-day Near Easterners, as the
most negative f₃ statistic in the Yamnaya (giving unambiguous evidence
of admixture) is observed when we model them as a mixture of EHG
and present-day Near Eastern populations like Armenians (Z = −6.3;
Supplementary Information section 7). The Late Neolithic/Bronze Age
groups of central Europe share more alleles with Yamnaya than the 
Middle Neolithic populations do (|Z| = 12.4) and more alleles with the 
Middle Neolithic than the Yamnaya do (|Z| = 12.5), and have a negative
f₃ statistic with the Middle Neolithic and Yamnaya as references
(Z = −20.7), indicating that they were descended from a mixture of
the local European populations and new migrants from the east. More- 
over, the Yamnaya share more alleles with the Corded Ware (|Z| = 3.6) 
than with any other Late Neolithic/Early Bronze Age group with at least 
two individuals (Supplementary Information section 7), indicating that 
they had more eastern ancestry, consistent with the PCA and ADMIXTURE 
patterns (Fig. 2). 

Modelling of the ancient samples shows that while Karelia is gen- 
etically intermediate between Loschbour and MA1, the topology that 
considers Karelia as a mixture of these two elements is not the only one 
that can fit the data (Supplementary Information section 8). To avoid 
biasing our inferences by fitting an incorrect model, we developed new 
statistical methods that are substantial extensions of a previously reported
approach’, which allow us to obtain precise estimates of the proportion 
of mixture in later Europeans without requiring a formal model for the 
relationship among the ancestral populations. The method (Supplemen- 
tary Information section 9) is based on the idea that if a Test population 
has ancestry related to reference populations Ref_1, Ref_2, ..., Ref_N in
proportions α_1, α_2, ..., α_N, and the references are themselves differentially
related to a triple of outgroup populations A, B, C, then:

$$f_4(\mathrm{Test}, A; B, C) = \sum_{i=1}^{N} \alpha_i \, f_4(\mathrm{Ref}_i, A; B, C)$$

By using a large number of outgroup populations we can fit the admixture
coefficients α_i and estimate mixture proportions (Supplementary
Information section 9, Extended Data Fig. 2). Using 15 outgroups 
from Africa, Asia, Oceania and the Americas, we obtain good fits as 
assessed by a formal test (Supplementary Information section 10), and 
estimate that the Middle Neolithic populations of Germany and Spain 
have ~18-34% more WHG-related ancestry than Early Neolithic 
populations and that the Late Neolithic and Early Bronze Age popula- 
tions of Germany have ~22-39% more EHG-related ancestry than the 
Middle Neolithic ones (Supplementary Information section 9). If we 
model them as mixtures of Yamnaya-related and Middle Neolithic 
populations, the inferred degree of population turnover is doubled to 
48-80% (Supplementary Information sections 9 and 10). 
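
The idea of fitting admixture coefficients from f₄ statistics over many outgroup triples can be illustrated for the simplest case of two reference populations. The sketch below is not the method of Supplementary Information section 9: it is an unweighted least-squares fit with a hypothetical function name, it assumes the two proportions sum to one, and the f₄ values are synthetic numbers generated for a known mixture proportion so that the estimate can be checked.

```python
def estimate_alpha(f4_test, f4_ref1, f4_ref2):
    """Least-squares estimate of alpha in
    f4(Test) ~ alpha * f4(Ref1) + (1 - alpha) * f4(Ref2),
    with one entry per outgroup triple (A; B, C).
    """
    num = sum((t - r2) * (r1 - r2)
              for t, r1, r2 in zip(f4_test, f4_ref1, f4_ref2))
    den = sum((r1 - r2) ** 2 for r1, r2 in zip(f4_ref1, f4_ref2))
    if den == 0.0:
        raise ValueError("references are not differentially related to the outgroups")
    return num / den


# Synthetic f4 values for four outgroup triples (illustrative only).
F4_REF1 = [0.010, -0.004, 0.007, 0.001]
F4_REF2 = [-0.002, 0.006, 0.000, -0.005]
TRUE_ALPHA = 0.75
F4_TEST = [TRUE_ALPHA * r1 + (1.0 - TRUE_ALPHA) * r2
           for r1, r2 in zip(F4_REF1, F4_REF2)]

print("estimated alpha:", estimate_alpha(F4_TEST, F4_REF1, F4_REF2))
```

The error raised for a zero denominator reflects the requirement in the text that the references be differentially related to the outgroups; without that, the proportions cannot be identified.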

To distinguish whether a Yamnaya or an EHG source fits the data 
better, we added ancient samples as outgroups (Supplementary Infor- 
mation section 9). Adding any Early or Middle Neolithic farmer results 
in EHG-related genetic input into Late Neolithic populations being a 
poor fit to the data (Supplementary Information section 9); thus, Late 
Neolithic populations have ancestry that cannot be explained by a mix- 
ture of EHG and Middle Neolithic. When using Yamnaya instead of 
EHG, however, we obtain a good fit (Supplementary Information sec- 
tions 9 and 10). These results can be explained if the new genetic material 
that arrived in Germany was a composite of two elements: EHG and a 
type of Near Eastern ancestry different from that which was introduced 
by early farmers (also suggested by PCA and ADMIXTURE; Fig. 2, Sup- 
plementary Information sections 5 and 6). We estimate that these two 
elements each contributed about half the ancestry of the Yamnaya 
(Supplementary Information sections 6 and 9), explaining why the 
population turnover inferred using Yamnaya as a source is about twice 
as high compared to the undiluted EHG. The estimate of Yamnaya- 
related ancestry in the Corded Ware is consistent when using either 
present populations or ancient Europeans as outgroups (Supplementary
Information sections 9 and 10), and is 73.1 ± 2.2% when both sets
are combined (Supplementary Information section 10). The best proxies
for ANE ancestry in Europe were initially Native Americans,
and then the Siberian MA1 (ref. 6), but both are geographically and
temporally too remote for what appears to be a recent migration into
Europe. We can now add three new pieces to the puzzle of how ANE
ancestry was transmitted to Europe: first by the EHG, then the Yamnaya 
formed by mixture between EHG and a Near Eastern related popu- 
lation, and then the Corded Ware who were formed by a mixture of the 
Yamnaya with Middle Neolithic Europeans. We caution that the sampled 
Yamnaya individuals from Samara might not be directly ancestral to 
Corded Ware individuals from Germany. It is possible that a more 
western Yamnaya population, or an earlier (pre-Yamnaya) steppe population
may have migrated into central Europe, and future work may
uncover more missing links in the chain of transmission of steppe ancestry. 

By extending our model to a three-way mixture of WHG, Early Neolithic 
and Yamnaya, we estimate that the ancestry of the Corded Ware was 
79% Yamnaya-like, 4% WHG, and 17% Early Neolithic (Fig. 3). A small 
contribution of the first farmers is also consistent with uniparentally 
inherited DNA: for example, mitochondrial DNA haplogroup N1a and
Y chromosome haplogroup G2a, common in early central European
farmers, almost disappear during the Late Neolithic and Bronze
Age, when they are largely replaced by Y haplogroups R1a and R1b (Supplementary
Information section 4) and mtDNA haplogroups I, T1, U2, U4,
U5a, W, and subtypes of H (Supplementary Information section 2).
The uniparental data not only confirm a link to the steppe populations 
but also suggest that both sexes participated in the migrations (Sup- 
plementary Information sections 2 and 4 and Extended Data Table 2). 
The magnitude of the population turnover that occurred becomes even 
more evident if one considers the fact that the steppe migrants may well 
have mixed with eastern European agriculturalists on their way to cen- 
tral Europe. Thus, we cannot exclude a scenario in which the Corded 
Ware arriving in today’s Germany had no ancestry at all from local 
populations. 

Figure 3 | Admixture proportions. We estimate mixture proportions
using a method that gives unbiased estimates even without an accurate 
model for the relationships between the test populations and the outgroup 
populations (Supplementary Information section 9). Population samples 
are grouped according to chronology (ancient) and Yamnaya ancestry 
(present-day humans). 


Our results support a view of European pre-history punctuated by 
two major migrations: first, the arrival of the first farmers during the 
Early Neolithic from the Near East, and second, the arrival of Yamnaya 
pastoralists during the Late Neolithic from the steppe. Our data further 
show that both migrations were followed by resurgences of the previous 
inhabitants: first, during the Middle Neolithic, when hunter-gatherer 
ancestry rose again after its Early Neolithic decline, and then between 
the Late Neolithic and the present, when farmer and hunter-gatherer 
ancestry rose after its Late Neolithic decline. This second resurgence 
must have started during the Late Neolithic/Bronze Age period itself, 
as the Bell Beaker and Unetice groups had reduced Yamnaya ancestry 
compared to the earlier Corded Ware, and comparable levels to that in 
some present-day Europeans (Fig. 3). Today, Yamnaya-related ancestry
is lower in southern Europe and higher in northern Europe, and all
European populations can be modelled as a three-way mixture of WHG, 
Early Neolithic, and Yamnaya, whereas some outlier populations show 
evidence for additional admixture with populations from Siberia and 
the Near East (Extended Data Fig. 3, Supplementary Information sec- 
tion 9). Further data are needed to determine whether the steppe ances- 
try arrived in southern Europe at the time of the Late Neolithic/Bronze 
Age, or is due to migrations in later times from northern Europe.

Our results provide new data relevant to debates on the origin and 
expansion of Indo-European languages in Europe (Supplementary Infor- 
mation section 11). Although the findings from ancient DNA are silent 
on the question of the languages spoken by preliterate populations, 
they do carry evidence about processes of migration which are invoked 
by theories on Indo-European language dispersals. Such theories make 
predictions about movements of people to account for the spread of 
languages and material culture (Extended Data Fig. 4). The technology
of ancient DNA makes it possible to reject or confirm the proposed 
migratory movements, as well as to identify new movements that 
were not previously known. The best argument for the “Anatolian
hypothesis” that Indo-European languages arrived in Europe from
Anatolia ~8,500 years ago is that major language replacements are 
thought to require major migrations, and that after the Early Neolithic 
when farmers established themselves in Europe, the population base 
was likely to have been so large that later migrations would not have 
made much of an impact. However, our study shows that a later
major turnover did occur, and that steppe migrants replaced ~75% of 
the ancestry of central Europeans. An alternative theory is the ‘steppe 
hypothesis’, which proposes that early Indo-European speakers were 
pastoralists of the grasslands north of the Black and Caspian Seas, and 
that their languages spread into Europe after the invention of wheeled 
vehicles. Our results make a compelling case for the steppe as a source
of at least some of the Indo-European languages in Europe by doc- 
umenting a massive migration ~4,500 years ago associated with the 
Yamnaya and Corded Ware cultures, which are identified by proponents 
of the steppe hypothesis as vectors for the spread of Indo-European 
languages into Europe. These results challenge the Anatolian hypothesis 
by showing that not all Indo-European languages in Europe can plaus- 
ibly derive from the first farmer migrations thousands of years earlier 
(Supplementary Information section 11). We caution that the location 
of the proto-Indo-European homeland that also gave rise to the
Indo-European languages of Asia, as well as the Indo-European lan- 
guages of southeastern Europe, cannot be determined from the data 
reported here (Supplementary Information section 11). Studying the 
mixture in the Yamnaya themselves, and understanding the genetic 
relationships among a broader set of ancient and present-day Indo- 
European speakers, may lead to new insight about the shared homeland. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

METHODS


Screening of libraries (shotgun sequencing and mitochondrial capture). The 
212 libraries screened in this study (Supplementary Information section 1) from 
a total of 119 samples (Supplementary Information section 3) were produced 
at Adelaide (n = 151), Tübingen (n = 16), and Boston (n = 45) (Supplementary
Data 1). 

The libraries from Adelaide and Boston had internal barcodes directly attached 
to both sides of the molecules from the DNA extract so that each sequence begins 
with the barcode. The Adelaide libraries had 5 base pair (bp) barcodes on both
sides, while the Boston libraries had 7 bp barcodes. Libraries from Tübingen had no
internal barcodes, but were differentiated by the sequence of the indexing primer.

We adapted a reported protocol for enriching for mitochondrial DNA, with
the difference that we adjusted the blocking oligonucleotides and PCR primers to
fit our libraries with shorter adapters. Over the course of this project, we also lowered
the hybridization temperature from 65 °C to 60 °C and performed stringent washes
at 55 °C instead of 60 °C.

We used an aliquot of approximately 500 ng of each library for target enrichment
of the complete mitochondrial genome in two consecutive rounds, using a
bait set for human mtDNA. We performed enrichment in 96-well plates with one
library per well, and used a liquid handler (Evolution P3, Perkin Elmer) for the
capture and washing steps. We used blocking oligonucleotides in hybridization
appropriate to the adapters of the truncated libraries. After either of the two enrichment
rounds, we amplified the enriched library molecules with the primer pair that
keeps the adapters short (PreHyb) using Herculase Fusion II PCR Polymerase. We
performed an indexing PCR of the final capture product using one or two indexing
primers. We cleaned up all PCR reactions using SPRI technology and the liquid
handler. Libraries from Tübingen were amplified with the primer pair IS5/IS6.

For libraries from Boston and Adelaide, we used a second aliquot of each library
for shotgun sequencing after performing an indexing PCR. We used unique
index combinations for each library and experiment, allowing us to distinguish
shotgun sequencing and mitochondrial DNA capture data, even if both experiments
were in the same sequencing run. We sequenced shotgun libraries and mtDNA
captured libraries from Tübingen in independent sequencing runs since the index
was already attached at the library preparation stage.

We quantified the sequencing pool with the BioAnalyzer (Agilent) and/or the
KAPA Library Quantification kit (KAPA Biosystems) and sequenced on Illumina
MiSeq, HiSeq2500 or NextSeq500 instruments for 2 × 75, 2 × 100 or 2 × 150
cycles along with the indexing read(s).

Enrichment for 394,577 SNP targets (‘390k capture’). The protocol for enrich- 
ment for SNP targets was similar to the mitochondrial DNA capture, with the 
exception that we used another bait set (390k) and about twice as much library (up 
to 1,000 ng) compared to the mtDNA capture. 

The specific capture reagent used in this study is described for the first time here. 
To target each SNP, we used a different oligonucleotide probe design compared to 
ref. 10. We used four 52 base pair probes for each SNP target. One probe ends just 
before the SNP, and one starts just after. Two probes are centred on the SNP, and 
are identical except for having the alternate alleles. This probe design avoids 
systematic bias towards one SNP allele or another. For the template sequence for 
designing the San and Yoruba panel baits, we used the sequence that was submitted
for these same SNPs during the design of the Affymetrix Human Origins
SNP array. For SNPs that were both in the San and Yoruba panels, we used the
Yoruba template sequence in preference. For all other SNPs, we used the human 
genome reference sequence as a template. Supplementary Data 2a-d gives the list 
of SNPs that we targeted, along with details of the probes used. The breakdown of 
SNPs into different classes is as follows. 

124,106 ‘Yoruba SNPs’: all SNPs in ‘panel 5’ of the Affymetrix Human Origins 
array (discovered as heterozygous in a Yoruba male: HGDP00927) that passed
the probe design criteria specified in ref. 11. 

146,135 ‘San SNPs’: all SNPs in ‘panel 4’ of the Affymetrix Human Origins array 
(discovered as heterozygous in a San male: HGDP01029) that passed probe
design criteria. The full Affymetrix Human Origins array panel 4 contains several
tens of thousands of additional SNPs overlapping those from panel 5, but we did 
not wish to redundantly capture panel 5 SNPs. 

98,166 ‘compatibility SNPs’: SNPs that overlap between the Affymetrix Human 
Origins, the Affymetrix 6.0, and the Illumina 610 Quad arrays, which are not already 
included in the ‘Yoruba SNPs’ or ‘San SNPs’ lists and that also passed the probe
design criteria.

26,170 ‘miscellaneous SNPs’: SNPs that did not overlap the Human Origins 
array. The subset analysed in this study comprised 2,258 Y chromosome SNPs
(http://isogg.org/tree/ISOGG_YDNA_SNP_Index.html) that we used for Y haplogroup
determination. 
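
The four-probe layout described before the class breakdown, and the class totals themselves, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' design code: the function name `design_probes` is hypothetical, and placing the SNP at offset 26 of the 52 bp centred probes is one possible reading of "centred on the SNP". The final check confirms that the four class counts listed above sum to the 394,577 targets.

```python
PROBE_LEN = 52  # probe length in base pairs, as stated in the text


def design_probes(template, snp_index, allele_a, allele_b, probe_len=PROBE_LEN):
    """Return four probes for one SNP located at template[snp_index].

    Hypothetical helper. The centred probes put the SNP at offset
    probe_len // 2, which is an assumption, not a documented detail.
    """
    half = probe_len // 2
    if snp_index - probe_len < 0 or snp_index + 1 + probe_len > len(template):
        raise ValueError("template too short to tile probes around the SNP")
    left = template[snp_index - half:snp_index]
    right = template[snp_index + 1:snp_index - half + probe_len]
    return {
        # ends on the base just before the SNP
        "upstream": template[snp_index - probe_len:snp_index],
        # starts on the base just after the SNP
        "downstream": template[snp_index + 1:snp_index + 1 + probe_len],
        # identical except for the allele at the SNP position
        "centred_a": left + allele_a + right,
        "centred_b": left + allele_b + right,
    }


# Class counts copied from the breakdown above.
SNP_CLASS_COUNTS = {
    "Yoruba": 124_106,
    "San": 146_135,
    "compatibility": 98_166,
    "miscellaneous": 26_170,
}
assert sum(SNP_CLASS_COUNTS.values()) == 394_577

toy_template = "ACGT" * 30  # invented 120 bp template
for name, probe in design_probes(toy_template, 60, "A", "G").items():
    print(name, len(probe), probe)
```

Because the two centred probes differ only at the SNP position, both alleles are targeted by otherwise identical baits, which is the property the text credits with avoiding systematic allelic bias.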

Processing of sequencing data. We restricted analysis to read pairs that passed 
quality control according to the Illumina software (‘PF reads’). 


We assigned read pairs to libraries by searching for matches to the expected 
index and barcode sequences (if present, as for the Adelaide and Boston libraries). 
We allowed no more than 1 mismatch per index or barcode, and zero mismatches 
if there was ambiguity in sequence assignment or if barcodes of 5 bp length were 
used (Adelaide libraries). 
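
The assignment rule above can be sketched in a few lines. This is one possible interpretation, not the authors' script: `assign_library` is a hypothetical helper, and treating "ambiguity" as more than one library matching within the mismatch allowance is an assumption.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_library(observed, expected, short_len=5):
    """Assign an observed barcode (or index) to a library, or return None.

    expected maps library name -> expected barcode sequence.
    One mismatch is tolerated, except for barcodes of short_len bp or less,
    where an exact match is required. If several libraries match within the
    allowance, only an exact match is accepted.
    """
    allowed = 0 if len(observed) <= short_len else 1
    hits = [
        lib for lib, barcode in expected.items()
        if len(barcode) == len(observed) and hamming(observed, barcode) <= allowed
    ]
    if len(hits) == 1:
        return hits[0]
    if len(hits) > 1:
        exact = [lib for lib in hits if expected[lib] == observed]
        return exact[0] if len(exact) == 1 else None
    return None


# Invented barcodes for illustration only.
seven_bp = {"libA": "ACGTACG", "libB": "TTGCAAC"}
five_bp = {"libC": "ACGTA", "libD": "TGCAT"}
print(assign_library("ACGTACC", seven_bp))  # one mismatch, 7 bp barcode
print(assign_library("ACGTT", five_bp))     # one mismatch, 5 bp barcode
```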

We used Seqprep (https://github.com/jstjohn/SeqPrep) to search for overlap- 
ping sequence between the forward and reverse read, and restricted to molecules 
where we could identify a minimum of 15 bp of overlap. We collapsed the two reads 
into a single sequence, using the consensus nucleotide if both reads agreed, and the 
read with higher base quality in the case of disagreement. For each merged nuc- 
leotide, we assigned the base quality to be the higher of the two reads. We further 
used Seqprep to search for the expected adaptor sequences at either end of the
merged sequence, and to produce a trimmed sequence for alignment. 
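
The merging rule can be illustrated with a small sketch. It is not SeqPrep's algorithm: `merge_pair` is a hypothetical function, the mismatch tolerance is an invented parameter, and adapter read-through is ignored (the sketch assumes the suffix of the forward read overlaps the prefix of the reverse-complemented reverse read). It does follow the three stated rules: at least 15 bp of overlap, the higher-quality base on disagreement, and the higher of the two qualities for each merged base.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(fwd, fwd_qual, rev, rev_qual, min_overlap=15, max_mismatch_frac=0.1):
    """Collapse a read pair into (sequence, qualities), or None if no overlap.

    Qualities are lists of integers. max_mismatch_frac is an assumption.
    """
    rc = reverse_complement(rev)
    rc_qual = rev_qual[::-1]
    # Try the longest candidate overlap first.
    for overlap in range(min(len(fwd), len(rc)), min_overlap - 1, -1):
        tail = fwd[len(fwd) - overlap:]
        head = rc[:overlap]
        mismatches = sum(x != y for x, y in zip(tail, head))
        if mismatches <= max_mismatch_frac * overlap:
            break
    else:
        return None  # fewer than min_overlap bases could be overlapped
    start = len(fwd) - overlap
    seq = list(fwd[:start])
    qual = list(fwd_qual[:start])
    for i in range(overlap):
        qa, qb = fwd_qual[start + i], rc_qual[i]
        # When the reads agree both choices are the same base;
        # when they disagree, the higher-quality base wins.
        seq.append(fwd[start + i] if qa >= qb else rc[i])
        qual.append(max(qa, qb))
    seq.extend(rc[overlap:])
    qual.extend(rc_qual[overlap:])
    return "".join(seq), qual


# Invented 30 bp molecule, fully covered by both reads.
molecule = "ACGTTGCAAGGCTTACCGATAGCTTGACCA"
fwd = molecule[:5] + "T" + molecule[6:]        # low-quality error at position 5
fwd_qual = [30] * 30
fwd_qual[5] = 10
rev = reverse_complement(molecule)
merged = merge_pair(fwd, fwd_qual, rev, [20] * 30)
print(merged[0] == molecule)
```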

We mapped all sequences using BWA-0.6.1 (ref. 35). For mitochondrial analysis
we mapped to the mitochondrial genome RSRS. For whole-genome analysis
we mapped to the human reference genome hg19. We restricted all analyses to
sequences that had a mapping quality of MAPQ ≥ 37.

We sorted all mapped sequences by position, and used a custom script to search
for mapped sequences that had the same orientation and start and stop positions.
We stripped all but one of these sequences (keeping the best quality one) as
duplicates.
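
The authors' custom script is not reproduced in the text; the following is only a sketch of the rule it describes. Alignments are represented as plain dictionaries, and using mean base quality as the measure of "best quality" is an assumption.

```python
def remove_duplicates(alignments):
    """Keep one alignment per (chrom, strand, start, end), preferring the
    highest mean base quality. Field names are hypothetical."""
    best = {}
    for aln in alignments:
        key = (aln["chrom"], aln["strand"], aln["start"], aln["end"])
        if key not in best or aln["mean_qual"] > best[key]["mean_qual"]:
            best[key] = aln
    # Return survivors in positional order.
    return sorted(
        best.values(),
        key=lambda a: (a["chrom"], a["start"], a["end"], a["strand"]),
    )


# Invented alignments for illustration.
reads = [
    {"name": "r1", "chrom": "chr1", "strand": "+", "start": 100, "end": 160, "mean_qual": 31.0},
    {"name": "r2", "chrom": "chr1", "strand": "+", "start": 100, "end": 160, "mean_qual": 36.5},
    {"name": "r3", "chrom": "chr1", "strand": "-", "start": 100, "end": 160, "mean_qual": 28.0},
    {"name": "r4", "chrom": "chr1", "strand": "+", "start": 100, "end": 158, "mean_qual": 30.0},
]
print([r["name"] for r in remove_duplicates(reads)])
```

Requiring both the start and the stop position to match is feasible here because the merged sequences cover whole molecules, so both ends of each molecule are observed.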
Mitochondrial sequence analysis and assessment of ancient DNA authenticity. 
For each library for which we had average coverage of the mitochondrial genome 
of at least tenfold after removal of duplicated molecules, we built a mitochondrial 
consensus sequence, assigning haplogroups for each library as described in Sup- 
plementary Information section 2. 

We used contamMix-1.0.9 to search for evidence of contamination in the mitochondrial
DNA. This software estimates the fraction of mitochondrial DNA
sequences that match the consensus more closely than a comparison set of 311 
worldwide mitochondrial genomes. This is done by taking the consensus sequence 
of reads aligning to the RSRS mitochondrial genome, and requiring a minimum 
coverage of 5 after filtering bases where the quality was <30. Raw reads are then 
realigned to this consensus. In addition, the consensus is multiply aligned with the 
other 311 mitochondrial genomes using kalign (2.0.4) to build the necessary
inputs for contamMix, trimming the first and last 5 bases of every read to mitigate 
against the confounding factor of ancient damage. This software had difficulty 
running on data sets with higher coverage, and for these data sets, we down- 
sampled to 50,000 reads. 

For all sequences mapping to the mitochondrial DNA for which the consensus 
mitochondrial DNA sequence had a cytosine at the terminal nucleotide, we measured
the proportion of sequences with a thymine at that position. For population
genetic analysis, we only used partially UDG-treated libraries with a minimum of
3% C→T substitutions as recommended by ref. 33. In cases where we used a fully
UDG-treated library for 390k analysis, we examined mitochondrial capture data
from a non-UDG-treated library made from the same extract, and verified that the
non-UDG library had a minimum of 10% C→T at the first nucleotide as recommended
by ref. 38. Metrics for the mitochondrial DNA analysis on each library are
given in Supplementary Data 1. 
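
The damage measurement can be sketched as below. This is an illustrative reading of the description, not the authors' code: the function names are hypothetical, reads are assumed to be gap-free and paired with the matching stretch of consensus, and "terminal nucleotide" is taken to mean the first base of each sequence. The 3% and 10% thresholds are those quoted in the text.

```python
MIN_CT_PARTIAL_UDG = 0.03  # minimum C->T rate for partially UDG-treated libraries
MIN_CT_NON_UDG = 0.10      # minimum C->T rate for non-UDG-treated libraries


def terminal_c_to_t_rate(pairs):
    """Fraction of sequences showing T where the consensus has a terminal C.

    pairs: iterable of (read_sequence, consensus_sequence) of equal length.
    Returns None when no sequence starts at a consensus cytosine.
    """
    n_consensus_c = 0
    n_read_t = 0
    for read, consensus in pairs:
        if consensus[0] == "C":
            n_consensus_c += 1
            if read[0] == "T":
                n_read_t += 1
    return n_read_t / n_consensus_c if n_consensus_c else None


def passes_damage_filter(rate, treatment):
    """treatment is 'partial_udg' or 'non_udg' (labels are assumptions)."""
    threshold = {"partial_udg": MIN_CT_PARTIAL_UDG, "non_udg": MIN_CT_NON_UDG}[treatment]
    return rate is not None and rate >= threshold


# Invented read/consensus pairs.
toy_pairs = [("TACG", "CACG"), ("CACG", "CACG"), ("AACG", "AACG"), ("TTTT", "CTTT")]
rate = terminal_c_to_t_rate(toy_pairs)
print(rate, passes_damage_filter(rate, "non_udg"))
```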
390k capture, sequence analysis and quality control. For 390k analysis, we 
restricted to reads that not only mapped to the human reference genome hg19 
but that also overlapped the 354,212 autosomal SNPs genotyped on the Human 
Origins array. We trimmed the last two nucleotides from each sequence because
we found that these are highly enriched in ancient DNA damage even for UDG-treated
libraries. We further restricted analyses to sites with base quality ≥ 30.

We made no attempt to determine a diploid genotype at each SNP in each sample. 
Instead, we used a single allele (randomly drawn from the two alleles in the
individual) to represent the individual at that site. Specifically, we made an
allele call at each target SNP using majority rule over all sequences overlapping the 
SNP. When each of the possible alleles was supported by an equal number of 
sequences, we picked an allele at random. We set the allele to ‘no call’ for SNPs 
at which there was no read coverage. 
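
The calling rule in this paragraph translates directly into code. The sketch below is illustrative: `call_allele` is a hypothetical name, and it assumes the observed bases have already passed the mapping-quality, base-quality and trimming filters described above.

```python
import random
from collections import Counter


def call_allele(bases, rng=random):
    """Single pseudo-haploid call at one SNP.

    bases: alleles observed in the sequences overlapping the SNP.
    Returns the majority allele, a random choice among tied alleles,
    or None ('no call') when there is no coverage.
    """
    if not bases:
        return None
    counts = Counter(bases)
    top = max(counts.values())
    tied = sorted(allele for allele, n in counts.items() if n == top)
    return tied[0] if len(tied) == 1 else rng.choice(tied)


print(call_allele(["A", "A", "G"]))               # majority rule
print(call_allele(["A", "G"], random.Random(1)))  # tie broken at random
print(call_allele([]))                            # no coverage
```

Passing a seeded `random.Random` instance makes the tie-breaking reproducible, which is convenient when calls need to be regenerated.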

We restricted population genetic analysis to libraries with a minimum of 0.06- 
fold average coverage on the 390k SNP targets, and for which there was an un- 
ambiguous sex determination based on the ratio of X to Y chromosome reads 
(Supplementary Information section 4 and Supplementary Data 1). For individuals
for whom there were multiple libraries per sample, we performed a series of
quality control analyses. First, we used the ADMIXTURE software in supervised
mode, using Kharia, Onge, Karitiana, Han, French, Mbuti, Ulchi and Eskimo
as reference populations. We visually inspected the inferred ancestry components
in each individual, and removed individuals with evidence of heterogeneity in
inferred ancestry components across libraries. For all possible pairs of libraries
for each sample, we also computed statistics of the form
$D(\mathrm{Library}_1, \mathrm{Library}_2; \mathrm{Probe}, \mathrm{Mbuti})$, where Probe is any of the same set of eight reference
populations, to determine whether there was significant evidence of the Probe
population being more closely related to one library from an ancient individual 
than another library from that same individual. None of the individuals that we 
used had strong evidence of ancestry heterogeneity across libraries. For samples 
passing quality control for which there were multiple libraries per sample, we 
merged the sequences into a single BAM. 
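
A point estimate of the D-statistic used in this check can be sketched as follows. The formula is the standard allele-frequency form of D (numerator summing (w - x)(y - z) over sites, with the usual normalising denominator); the code is not the authors' pipeline, the toy values are invented, and the significance assessment (typically a block jackknife) is omitted.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-site allele frequencies in [0, 1].

    Each argument is a list with one frequency per SNP; for a single
    pseudo-haploid library the values are 0 or 1. Sites with a missing
    value (None) in any population are skipped.
    """
    num = 0.0
    den = 0.0
    for wi, xi, yi, zi in zip(w, x, y, z):
        if None in (wi, xi, yi, zi):
            continue
        num += (wi - xi) * (yi - zi)
        den += (wi + xi - 2 * wi * xi) * (yi + zi - 2 * yi * zi)
    return num / den if den else None


# Invented data: two libraries from one individual, a Probe population and Mbuti.
library1 = [1, 0, 1, 1, 0]
library2 = [1, 0, 0, 1, 1]
probe = [0.8, 0.1, 0.9, 0.5, 0.2]
mbuti = [0.3, 0.4, 0.2, 0.5, 0.6]
print(d_statistic(library1, library2, probe, mbuti))
```

For two libraries that sample the same individual without contamination, D is expected to be consistent with zero for every Probe population; a consistent departure from zero would suggest heterogeneous ancestry between the libraries.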

We called alleles on each merged BAM using the same procedure as for the 
individual libraries. We used ADMIXTURE as well as PCA as implemented in
EIGENSOFT (using the lsqproject: YES option to project the ancient samples) to
visualize the genetic relationships of each set of samples with the same culture label
with respect to 777 diverse present-day West Eurasians. We visually identified
outlier individuals, and renamed them for analysis either as outliers or by the name
of the site at which they were sampled (Extended Data Table 1). We also identified
two pairs of related individuals based on the proportion of sites covered in pairs of
ancient samples from the same population that had identical allele calls using
PLINK. From each pair of related individuals, we kept the one with the most SNPs.
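
The relatedness screen can be illustrated without PLINK. The sketch below is not the authors' procedure: the function names are hypothetical, no cutoff is given in the text so none is applied, and the calls are the single-allele calls described earlier, with None marking a 'no call'.

```python
from itertools import combinations


def identical_call_rate(calls_a, calls_b):
    """Proportion of jointly covered sites with identical allele calls."""
    shared = [(a, b) for a, b in zip(calls_a, calls_b) if a is not None and b is not None]
    if not shared:
        return None
    return sum(a == b for a, b in shared) / len(shared)


def pairwise_rates(calls_by_sample):
    """Identical-call rate for every pair of samples in one population."""
    return {
        (a, b): identical_call_rate(calls_by_sample[a], calls_by_sample[b])
        for a, b in combinations(sorted(calls_by_sample), 2)
    }


def keep_better_covered(a, b, calls_by_sample):
    """From a related pair, keep the sample with more called SNPs."""
    n_a = sum(c is not None for c in calls_by_sample[a])
    n_b = sum(c is not None for c in calls_by_sample[b])
    return a if n_a >= n_b else b


# Invented calls for three samples at five SNPs.
toy_calls = {
    "s1": ["A", "G", "C", None, "T"],
    "s2": ["A", "G", "C", "T", None],
    "s3": ["G", "A", "C", "T", "T"],
}
for pair, rate in pairwise_rates(toy_calls).items():
    print(pair, rate)
```

Pairs whose rate stands well above the rest of their population are candidates for close relatives, since unrelated individuals from one population already share many calls by chance.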
Population genetic analyses. We determined genetic sex using the ratio of X and 
Y chromosome alignments (Supplementary Information section 4), and Y chromosome
haplogroup for the male samples (Supplementary Information section 4).
We studied population structure (Supplementary Information sections 5 and 6).
We used f statistics to carry out formal tests of population relationships (Supplementary
Information section 6) and built explicit models of population history consistent
with the data (Supplementary Information section 7). We estimated mixture pro- 
portions in a way that was robust to uncertainty about the exact population history 
that applied (Supplementary Information section 8). We estimated the minimum 
number of streams of migration into Europe needed to explain the data (Supplemen- 
tary Information sections 9 and 10). The estimated mixture proportions shown 
in Fig. 3 were obtained using the lsqlin function of Matlab and the optimization
method described in Supplementary Information section 9 with 15 world outgroups. 
Sample size. No statistical methods were used to predetermine sample size.

Extended Data Figure 1 | Outgroup $f_3$ statistic $f_3(\mathrm{Dinka}; X, Y)$, measuring the degree of shared drift among pairs of ancient individuals.

Extended Data Figure 2 | Modelling Corded Ware as a mixture of N = 1, 2,
or 3 ancestral populations. a, The left column shows a histogram of raw $f_4$
statistic residuals and on the right Z-scores for the best-fitting (lowest
squared 2-norm of the residuals, or resnorm) model at each N. b, The data on
the left show resnorm and on the right show the maximum |Z| score change for
different N. c, resnorm of different N = 2 models. The set of outgroups used in
this analysis in the terminology of Supplementary Information section 9 is
‘World Foci 15 + Ancients’.

Extended Data Figure 3 | Modelling Europeans as mixtures of increasing
complexity: N = 1 (EN), N = 2 (EN, WHG), N = 3 (EN, WHG, Yamnaya),
N = 4 (EN, WHG, Yamnaya, Nganasan), N = 5 (EN, WHG, Yamnaya,
Nganasan, BedouinB). The residual norm of the fitted model (Supplementary
Information section 9) and its changes are indicated.

Extended Data Figure 4 | Geographic distribution of archaeological
cultures and graphic illustration of proposed population movements /
turnovers discussed in the main text. a, Proposed routes of migration by early
farmers into Europe ~9,000–7,000 years ago. b, Resurgence of hunter-gatherer
ancestry during the Middle Neolithic 7,000–5,000 years ago. c, Arrival of
steppe ancestry in central Europe during the Late Neolithic ~4,500 years ago.
White arrows indicate the two possible scenarios of the arrival of Indo-European
language groups. Symbols of samples are identical to those in Fig. 1.

Extended Data Table 1 | Number of ancient Eurasian modern human samples screened in genome-wide studies to date

First author | Description | No. samples at ≥0.05× coverage (enough for Procrustes analysis) | No. samples at >0.25× coverage (enough to analyze in pairs)
Keller | Tyrolean Iceman | 1 | 1
Raghavan | Upper Paleolithic Siberians | 2 | 1
Olalde | Mesolithic Iberian from La Braña | 1 | 1
Skoglund | Farmers and hunter-gatherers from Sweden | ? | 2
Lazaridis | Early European farmer from Germany & Mesolithic hunter-gatherers from Luxembourg and Sweden | ? | 4
Gamba | Neolithic, Bronze Age, Iron Age Hungary | 13 | 2
Fu | Upper Paleolithic Siberian from Ust-Ishim | 1 |
Seguin-Orlando | Upper Paleolithic European from Kostenki | 1 | 1
Total before study | | 37 | 20
This study | Hunter-gatherers and pastoralists from Russia, Mesolithic hunter-gatherers from Sweden, Early Neolithic from Germany, Hungary, and Spain, Middle Neolithic from Germany & Spain, Late Neolithic / Bronze Age from Germany | 69 | 58

Only studies that produced at least one sample at ≥ 0.05× coverage are listed.

Extended Data Table 2 | Summary of the archaeological context for the 69 newly reported samples


Reich ID  Pop. label for analysis  Culture  Group  Location and sample details (e.g. sample, grave and museum ID)  Date (lab no.)  Country  Sex  mt-hg  Y-hg  Autosomal SNPs
To0et Karelia_HG Russian Mesolithic ENG Yuzhnyy Oleni Ostrov, Karelia, Russia, UZ0074, grave 142, MAE RAS 5/73-74 5500-5000 BC Russia ™ Cig (formerly Cif) Riat 41564 
10124 Samara_HG Russian Neolithic HG EHG Sok River, Samara, Russia; SVP44 5650-5555 cal BC (Beta — 392490) Russia M Usaid Ribt 208748 
loot Motala_HG Swedish Mesolithic SHG Motala, Sweden; Motala 1 5898-5531 cal BC Sweden F Usat 228271 
loo12 Motala_HG Swedish Mesolithic SHG Motala, Sweden; Motala 2 5898-5531 cal BC Sweden = M U2e1 12c2 292853 
loots Motala_HG Swedish Mesolithic SHG Motala, Sweden; Motala 3 5898-5531 cal BC Sweden = M Usat Iaib 251108 
1oo14 Motala_HG Swedish Mesolithic SHG Motala, Sweden; Motala 4 5898-5531 cal BC Sweden F USa2d 311299 
1001s Motala_HG Swedish Mesolithic SHG Motala, Sweden; Motala 6 5898-5531 cal BC Sweden = M USa2d lat 285307 
loots Motala_HG Swedish Mesolithic SHG Motala, Sweden; Motala 9 5898-5531 cal BC Sweden = M U5a2 at 275233 
10017 Motala_HG Swedish Mesolithic SHG Motala, Sweden; Motala 12 5898-5531 cal BC Sweden = M U2e1 Iatb 337794 
10174 Starcevo EN Starcevo EN Alsonyék-Bataszék, Mémoki telep, Hungary; BAM25a, feature 1532 5710-5550 cal BC (MAMS 11939 ) Hungary = M Niatatb H2 101653 
10176 LBKT_EN — LBKT EN Szemely-Hegyes, Hungary: SZEH4b, feature 1001 5210-4940 cal BC (Beta - 310038) Hungary Niatata3 30718 
loo4s LBK_EN _LBK EN Halberstadt-Sonntagsfeld, Germany: HALS, grave 2, feature 241.1 5206-5004 cal BC (MAMS 21479) Germany F T2cid'et 266764 
10048 LBK_EN LBK EN Halberstadt-Sonntagsfeld, Germany: HAL25, grave 28, feature 861 5206-5052 cal BC (MAMS 21482) Germany M Kila G2a2a 123828 
10056 LBK_EN _LBK EN Halberstadt-Sonntagsfeld, Germany; HAL14, grave 15, feature 430 5206-5052 cal BC (MAMS 21480) Germany M T2b(8) G2a2a 136578 
10057 LBK_EN _LBK EN Halberstadt-Sonntagsfeld; HAL34, grave 38, feature 992 5207-5067 cal BC (MAMS 21483) Germany F Niatat 55802 
10100 LBK_EN _LBK EN Halberstadt-Sonntagsfeld; HAL4, grave 1, feature 139 5032-4946 cal BC (KIA40341) Germany F Niatata 342342 
10659 LBK_EN _LBK EN Halberstadt-Sonntagsfeld, Germany; HAL2, grave 35, feature 999 5079-4997 cal BC (KIA40350) Germany M Niata G2a2al 191007 

5066-4979 cal BC (KIA30408) 
10821 LBK_EN _LBK EN Halberstadt-Sonntagsfeld, Germany; HAL24, grave 27, feature 867 5034-4942 cal BC (KIA40348) Germany M Pre-X2d1 G2a2at 55914 
10795 LBK_EN _LBK EN Karsdorf, Germany; KAR6a, feature 170 5207-5070 cal BC (MAMS 22823) Germany M HI Tia 47804 
10054 LBK_EN LBK EN Oberwiederstedt-Unterwiederstedt, UWS4, Germany, grave 6, feature 1 14 5209-5070 cal BC (MAMS 21485) Germany F Jict7 337625 
loo22 LBK_EN _LBK EN Viesenhauser Hof, Stuttgart-Muhlhausen, Germany; LBK1976 5500-4800 BC Germany F T2e 160852 
10025 LBK_EN _LBK EN Viesenhauser Hof, Stuttgart-Mahlhausen, Germany; LBK1992 5500-4800 BC Germany F T2b 307686 
10026 LBK_EN _LBK EN Viesenhauser Hof, Stuttgart-Mahlhausen, Germany; LBK2155 5500-4800 BC Germany F T2b 315484 
10409 SpainEN — Els_Trocs EN Els Trocs, Spain; Troct 5311-5218 cal BC (MAMS 16159) Spain F Jic3 172903 
10410 SpainEN — Els_Trocs EN Els Trocs, Spain; Troc3 5178-5066 cal BC (MAMS 16161) Spain M pre-T2c1d2 Ribt 297595 
10444 Spain_EN_relative_of_10410 Els Trocs EN Els Trocs, Spain; Troc4 5177-5068 cal BC (MAMS 16162) Spain M Kia2a Fr 31507 
10412 SpainEN —Els_Trocs EN Els Trocs, Spain; Troc 5 5310-5206 cal BC (MAMS 16164) Spain M Niatat I2a1b1 333940 
10413 SpainEN — Els_Trocs EN Els Trocs, Spain; Troc7 5303-5204 cal BC (MAMS 16166) Spain F Vv 295844 
10405 Spain.MN La Mina MN La Mina, Spain; Mina3 3900-3600 BC Spain M Kiatbt I2atatlH2? 133230 
10406 Spain.MN La Mina MN La Mina, Spain; Mina4 3900-3600 BC Spain M Hi Ia2a1 324169 
10407 Spain.MN La Mina MN La Mina, Spain; Mina6b 3900-3600 BC Spain F Kibtat 236225 
10408 SpainMN La Mina MN La Mina, Spain; Mina18a 3900-3600 BC Spain F pre-USbti 321761 
10172 Esperstedt_MN —Salzmunde/Bernburg MN Esperstedt, Germany; ESP24, feature 1841 3360-3086 cal BC (Eri8699) Germany M T2b Iatbla 279147 
10569 Baalberge MN _Baalberge MN Quediinburg, Germany; QLB15D, feature 21033 3645-3537 cal BC (MAMS 22818) Germany M HV6'17 R? 64304 
10560 Baalberge MN _Baalberge MN Quediinburg, Germany; QLB18A, feature 21039 3640-3510 cal BC (Er7856) Germany F T2e1 133305 
10807 Baalberge MN _Baalberge MN Esperstedt, Germany; ESP30, feature 6220 3887-3797 cal BC (Er7784) Germany M Hieta FY 33481 
10234 Yamnaya  Yamnaya EBA Ekaterinovka,_Southem Steppe, Samara , Russia, SVP3 2910-2875 cal BC (Beta 392487) Russia M U4at Ribta2az 348142 
10367 Yamnaya  Yamnaya EBA Lopatino |, Sok_River, Samara, Russia; SVP5 same sample as SVP37 3090-2910 BC (Beta 392489) Russia F we 163845 
10370 Yamnaya  Yamnaya EBA Ishkinovka I, Eastern Orenburg, Pre-Ural steppe, Samara, Russia: SVP10 3300-2700 BC Russia M Hi3atata Ribta2a2 199345 
10429 Yamnaya — Yamnaya EBA Lopatino |, Sok River, Samara, Russia, SVP38 3339-2917 cal BC (AA47804) Russia M T2cta2 Ribta2a2 217664 
10438 Yamnaya  Yamnaya EBA Luzhki I, Samara River, Samara, Russia; SVP50 3021-2635 cal BC (AA47807) Russia M Usatat Ribta2a2 213493, 
10439 Yamnaya  Yamnaya EBA Lopatino |, Sok River, Samara, Russia, SVP52 3305-2925 cal BC (Beta 392491) Russia M Usatat Ribla 98900 
10444 Yamnaya  Yamnaya EBA Kurmanaevskii II, Buzuluk, Samara, Russia; SVP54 3010-2622 cal BC (AA47805) Russia F H2b 51326 
10443 Yamnaya  Yamnaya EBA Lopatino Il, Sok River, Samara, Russia; SVP57 3300-2700 BC Russia M W3ata Ribta2a 343890 
10444 Yamnaya  Yamnaya EBA Kutuluk I, Kutuluk River, Samara, Russia; SVP5B 3300-2700 BC Russia M H6a1b Ribta2a2 187126 
10550 Karsdorf_LN unknown LN Karsdorf, Germany; KAR22a, feature 191 2564-2475 cal BC (MAMS 23344) Germany F Tiat 59907 
10103 Corded_Ware_LN Corded Ware LN Esperstedt, Germany; ESP16, feature 6236 2566-2477 cal BC (MAMS 21488) Germany F Wea 336918 
10049 Corded_Ware_LN Corded Ware LN Esperstedt, Germany; ESP22, feature 6140 2454-2291 cal BC (MAMS 21489) Germany F X2b4 167170 
10106 Corded_Ware_LN Corded Ware LN Esperstedt, Germany; ESP26, feature 6233.1 2454-2291 cal BC (MAMS 21490) Germany F T2aibt 69886 
10104 Corded_Ware_LN Corded Ware LN Esperstedt, Germany; ESP11, feature 6216 2473-2348 cal BC (MAMS 21487) Germany M U4biatat Riatat 336637 
10059 BenzigerodeHeimburg_LN Bell Beaker? LN Benzingerode-Heimburg, Germany; BZHG, grave 2, feature/find 1287/1036 2286-2153 cal BC (MAMS 21486) Germany F H1 /Hib'ad 241081 
10058 BenzigerodeHeimburg_LN Bell Beaker LN Benzingerode-Heimburg, Germany; BZH4, grave 7, feature 4607 2283-2146 cal BC (MAMS 21491) Germany F Hie 246728 
10174 BenzigerodeHeimburg_LN Bell Beaker? LN Benzingerode-Heimburg, Germany; BZH12, grave 3, feature 6256 2204-2136 cal BC (KIA27952) Germany F Usata2a 66800 
10112 Bell_Beaker_LN Bell Beaker LN Quedlinburg XII, Germany; QUEXII6, feature 6256 2340-2190 cal BC (Er7038) Germany F H13ata2 341003 
101143 Bell_Beaker_LN Bell Beaker LN Quedlinburg XII, Germany; QUEXII4, feature 6255.1 2290-2130 cal BC (Er7283) Germany F J1e5 190352 
10108 Bell_Beaker_LN Bell Beaker LN Rothenschirmbach, Germany; ROT6, feature 10044 2497-2436 cal BC (Er8710) Germany F H5a3 260528 
10111 Bell_Beaker_LN Bell Beaker LN Rothenschirmbach, Germany; ROT4, feature 10142 2414-2333 cal BC (Er8712) Germany F H3new 208256 
10060 Bell_Beaker_LN Bell Beaker LN Rothenschirmbach, Germany; ROT3, feature 10011 2294-2206 cal BC (MAMS 22819) Germany F K1a2e 47085 
10806 Bell_Beaker_LN Bell Beaker LN Quedlinburg VII 2, Germany; QLB28b, feature 19617 2296-2206 cal BC (MAMS 22820) Germany M H1 Ribtazata2 91757 
10118 Alberstedt_LN unknown LN Alberstedt, Germany; ALB3, feature 7144.2 2459-2345 cal BC (MAMS 21492) Germany F HV6"17 349956 
10144 Unetice_EBA relative of 10117  Unetice EBA Esperstedt, Germany; ESP2, feature 3340.1 2131-1979 cal BC (MAMS 21493) Germany M 1a 12a2 217031 
10115 Unetice_EBA Unetice EBA Esperstedt, Germany; ESP3, feature 1559.1 1931-1780 cal BC (MAMS 21494) Germany F Usat 123744 
10116 Unetice_EBA Unetice EBA Esperstedt, Germany; ESP4, feature 3322/3323 2118-1961 cal BC (MAMS 21495) Germany M W3a1 12c2 308158 
10117 Unetice_EBA Unetice EBA Esperstedt, Germany; ESP29, feature 3332/3333 2199-2064 cal BC (MAMS 21496) Germany F Ba 279996 
10164 Unetice_EBA Unetice EBA Quedlinburg VIII, Germany; QUEVII6, feature 3580 2012-1919 cal BC (MAMS 21497) Germany F pre-USb2a1b 332832 
10803 Unetice_EBA Unetice EBA Eulau, Germany; EUL41A, feature 882 2115-1966 cal BC (MAMS 22822) Germany F H4aiat 144186 
10804 Unetice_EBA Unetice EBA Eulau, Germany; EUL57B, feature 1911.5 2131-1982 cal BC (MAMS 22821) Germany M H3 12 22869 
10047 Unetice_EBA Unetice EBA Halberstadt-Sonntagsfeld, Germany; HAL‘6, grave 19, feature 613.1 2022-1937 cal BC (MAMS 21481) Germany F Vv 288353 
loose Halberstadt_LBA Late Bronze Age LBA Halberstadt-Sonntagsfeld, Germany; HAL36C, grave 40, feature 1114 1113-1021 cal BC (MAMS 21484) Germany M H23 Riatatb1a2 337566 


Samples with direct radiocarbon dates are indicated by a calibrated date “cal BC” along with associated laboratory numbers. Dates that are estimated based on faunal elements associated with the samples are not indicated with ‘cal’ (although they are still calibrated, absolute dates). 
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Extended Data Table 3 


Pairwise FST for all ancient groups with ≥ 2 individuals, present-day Europeans with ≥ 10 individuals, and selected other groups.

Groups: Armenian, Baalberge_MN, Basque, BedouinB, Belarusian, Bell_Beaker_LN, BenzigerodeHeimburg_LN, Bergamo, Bulgarian, Corded_Ware_LN, Croatian, Czech, EHG, English, Estonian, French, Greek, Han, Hungarian, HungaryGamba_BA, HungaryGamba_EN, Icelandic, Iraqi_Jew, Karitiana, LBK_EN, Lezgin, Spain_EN, Spain_MN, Spanish, SwedenSkoglund_NHG, Turkish, Unetice_EBA, WHG, Yamnaya, Yoruba

Armenian: 0.023, 0.017, 0.025, 0.014, 0.017, 0.019, 0.007, 0.006, 0., 0.009, 0.011, 0.067, 0.011, 0.017, 0.009, 0.004, 0.108, 0.009, 0.016, 0.016, 0.015, 0.005, 0.210, 0.023, 0.005, 0.019, 0.045, 0.014, 0.088, 0.014, 0.168, 0.014, 0.186, 0.015, 0.014, 0.003, 0.016, 0.034, 0.033, 0.008, 0.071, 0.001, 0.016, 0.086, 0.030, 0.142

Baalberge_MN: 0.010, 0.009, 0.060, 0.008, 0.014, 0.006, 0.010, 0.120, 0.009, 0.010, 0.018, 0.011, 0.021, 0.218, 0.024, 0.018, 0.014, 0.061, 0.014, 0.072, 0.010, 0.181, 0.011, 0.199, 0.014, 0.013, 0.011, 0.032, 0.030, 0.025, 0.005, 0.058, 0.014, 0.014, 0.062, 0.034, 0.155

Bell_Beaker_LN: 0.001, 0.005, 0.001, 0.001, 0.001, 0.004, 0.008, 0.008, 0.008, 0.007, 0.007, 0.041, 0.006, 0.009, 0.006, 0.010, 0.114, 0.007, 0.008, 0.020, 0.008, 0.022, 0.208, 0.021, 0.015, 0.010, 0.055, 0.010, 0.057, 0.007, 0.174, 0.009, 0.194, 0.010, 0.016, 0.011, 0.026, 0.028, 0.022, 0.007, 0.050, 0.013, 0.002, 0.055
Reducing the energy cost of human walking using an unpowered exoskeleton 


Steven H. Collins, M. Bruce Wiggin & Gregory S. Sawicki 


With efficiencies derived from evolution, growth and learning, humans 
are very well-tuned for locomotion. Metabolic energy used during 
walking can be partly replaced by power input from an exoskeleton, 
but is it possible to reduce metabolic rate without providing an addi- 
tional energy source? This would require an improvement in the 
efficiency of the human-machine system as a whole, and would be 
remarkable given the apparent optimality of human gait. Here we 
show that the metabolic rate of human walking can be reduced by an 
unpowered ankle exoskeleton. We built a lightweight elastic device 
that acts in parallel with the user’s calf muscles, off-loading muscle 
force and thereby reducing the metabolic energy consumed in con- 
tractions. The device uses a mechanical clutch to hold a spring as it is 
stretched and relaxed by ankle movements when the foot is on the 
ground, helping to fulfil one function of the calf muscles and Achilles 
tendon. Unlike muscles, however, the clutch sustains force passively. 
The exoskeleton consumes no chemical or electrical energy and deli- 
vers no net positive mechanical work, yet reduces the metabolic cost 
of walking by 7.2 ± 2.6% for healthy human users under natural 
conditions, comparable to savings with powered devices. Improving 
upon walking economy in this way is analogous to altering the struc- 
ture of the body such that it is more energy-effective at walking. While 
strong natural pressures have already shaped human locomotion, 
improvements in efficiency are still possible. Much remains to be 
learned about this seemingly simple behaviour. 

Humans are skilled walkers. Over generations, our bodies have evolved 
muscular, skeletal and neural systems well-suited to locomotion. We 
learn and embed walking coordination strategies over our lifetimes 
and adapt to new locomotor environments in minutes or seconds. We 
take about 10,000 steps per day, or hundreds of millions of steps in a 
lifetime, exceeding the approximately 10,000 h of practice thought to 
be needed to attain expertise by adulthood. We naturally keep energy 
expenditure low during walking, choosing, for example, step length 
and even arm motions that minimize energy cost. Nearly any change 
to the human musculoskeletal system or its pattern of coordination 
increases metabolic rate. Despite this skill and efficiency, getting about 
is still expensive. People expend more energy during walking than any 
other activity of daily life, and fatigue can limit mobility. Herein lies 
the challenge: reducing the effort of normal walking could garner sub- 
stantial benefits, but humans are already so energy-effective that mak- 
ing improvements is extremely difficult. 

Since at least the 1890s, engineers have designed machines intended 
to make walking easier. A survey of these designs can be found in 
the Supplementary Discussion. It is only recently that any attempt at 
reducing the energy cost of walking with an external device has met 
with success. The first machine to do so used off-board pneumatic 
pumps and valves to replace human joint work with exoskeleton work, 
overcoming the surprisingly tricky challenge of coordinating assistance 
with the human neuromuscular system. More recently still, a powered 
and untethered device using similar control strategies succeeded in 
reducing energy cost, overcoming the additional challenge of 
autonomous packaging. 

Reducing the energy cost of walking with an unpowered device requires 
a different approach. Instead of adding a robotic energy source to replace 
metabolic sources, one must, in a sense, change the human body such 
that it is more efficient at locomotion (Extended Data Fig. 1). For the 
task of carrying heavy loads while walking, such improvements have 
been demonstrated using a spring-mounted backpack and by training 
people to balance the weight on their head in just the right way. 
But is there room for a similar improvement in the already expert task 
of normal walking? 

The possibility of unpowered assistance is made more likely by the 
fact that level walking at steady speed requires no power input in theory, 
and therefore all energy used in this activity is, in a sense, wasted. 
Simulation models with spring-loaded legs illustrate this idea; their springs 
store and return energy during each step, but no mechanical work is done 
by actuators, capitalizing on the fact that the kinetic and potential energy 

Figure 1 | Unpowered exoskeleton design. a, The exoskeleton comprises 
rigid sections attached to the human shank and foot and hinged at the ankle. 
A passive clutch mechanism and series spring act in parallel with the calf 
muscles and Achilles tendon. b, Participant walking with the device. Load cells 
measured spring force. c, The passive clutch mechanism has no electronics, 
but instead uses a ratchet and pawl that mechanically engage the spring when 
the foot is on the ground and disengage it when the foot is in the air. 

of the body remain constant on average. Humans expend metabolic 
energy during walking in part to restore energy that has been dissipated, 
in passive motions of soft tissues for example, but the greatest 
portion of waste occurs in muscles. Muscles consume metabolic energy 
to perform positive work, as required by conservation of energy, but 
they also use metabolic energy to produce force isometrically and to 
perform negative work. This places a metabolic cost on body weight 
support and on holding tendons as they stretch and recoil. By contrast, 
mechanical clutches require no energy to produce force. 

We designed a lightweight exoskeleton that provides some of the 
functions of the calf muscles and tendons during walking, but uses more 
efficient structures for those tasks. It has a spring in parallel with the 
Achilles tendon (Fig. 1a) connected to the leg using a lightweight com- 
posite frame with a lever about the ankle joint (Fig. 1b and Extended 
Data Fig. 2). A mechanical clutch in parallel with the calf muscles engages 
the spring when the foot is on the ground and disengages it to allow free 
motion when the foot is in the air (Fig. 1c and Supplementary Video 1). 
This design was inspired by ultrasound imaging studies suggesting clutch-like 
behaviour of muscle fascicles to hold the spring-like Achilles tendon, 
the recoil of which leads to the largest burst of positive mechanical 
power at any joint during walking. The exoskeleton clutch, described 
in detail in the Supplementary Methods and Supplementary Video 2, 
has no motor, battery or computer control, and weighs 0.057 kg. The 
entire exoskeleton has a mass of between 0.408 and 0.503 kg per leg, 
depending on participant size (Extended Data Tables 1 and 2). On the 
basis of simulation studies of walking with elastic ankles, we expected 
an intermediate stiffness to minimize energy cost and performed tests 
with a range of springs. 
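
The clutch-and-spring behaviour described above can be illustrated with a minimal sketch. It is a simplified model and not the authors' mechanism: the real clutch is purely mechanical and triggered by ankle motion, whereas this sketch assumes a foot-contact flag is available. The function name, the sign convention (positive angles are dorsiflexion) and the sample values are hypothetical.

```python
import math


def clutch_spring_torque(ankle_angles_deg, foot_on_ground, stiffness_nm_per_rad):
    """Return a modelled exoskeleton torque (N m) for each sample of a stride.

    Assumed behaviour: while the foot is on the ground the ratchet follows the
    most plantarflexed angle reached so far (taking up slack); any dorsiflexion
    beyond that locked angle stretches the spring. In swing the clutch is
    released and the torque is zero.
    """
    torques = []
    lock_angle = None  # None means the clutch is disengaged
    for angle, contact in zip(ankle_angles_deg, foot_on_ground):
        if not contact:
            lock_angle = None
            torques.append(0.0)
            continue
        if lock_angle is None or angle < lock_angle:
            lock_angle = angle  # ratchet takes up slack
        stretch_rad = math.radians(angle - lock_angle)
        torques.append(stiffness_nm_per_rad * stretch_rad)
    return torques


# Hypothetical ankle trajectory (degrees) and contact flags for one stride.
angles = [0.0, -5.0, -8.0, -3.0, 5.0, 10.0, 2.0, 0.0]
contact = [True, True, True, True, True, True, False, False]
torques = clutch_spring_torque(angles, contact, stiffness_nm_per_rad=180.0)
print([round(tq, 1) for tq in torques])
```

In this model the torque is zero while slack is taken up, grows as the ankle dorsiflexes past the locked angle, and returns to zero in swing, which matches the qualitative description of the device.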

We conducted experiments with healthy participants (N = 9) wearing 
an exoskeleton on each leg while walking at a normal speed (1.25 m s⁻¹) 
on a treadmill. The exoskeleton produced a pattern of torque similar 
to that produced by the biological ankle, but with lower magnitude 
(Fig. 2a). This reduced the ankle moment produced by calf muscles 
(Fig. 2b) and reduced calf muscle activation, particularly in the soleus 
(Fig. 2c). Joint angles changed little across conditions (Fig. 2d), con- 
firming that the exoskeleton did not interfere with other normal ankle 
functions, such as toe clearance during leg swing (60-100% stride). 

The exoskeleton reduced human metabolic energy consumption 
when using moderate-stiffness springs (Fig. 3). Wearing a lightweight 
exoskeleton on each ankle without springs did not measurably increase 
energy cost compared with normal walking. With increasing spring 
stiffness, metabolic rate first decreased then increased, supporting 
the hypothesis that an intermediate stiffness would be optimal. The 
180 N m rad⁻¹ spring reduced the metabolic cost of walking to 2.67 
± 0.14 W kg⁻¹ (mean ± standard error), down from 2.88 ± 0.10 W kg⁻¹ 
for normal walking, a reduction of 7.2 ± 2.6% (paired t-test: P = 0.023). 
Metabolic energy used for walking, or net metabolic rate, is calculated 
as total metabolic rate minus the rate for quiet standing, which was 

Figure 2 | Mechanics and muscle activity. a, Exoskeleton torque (normalized 
to body mass) in time (normalized to stride period) for each spring, averaged 
across participants. Bars at right are the averages of these trajectories in time; 
N = 9; error bars, s.e.m.; P values indicate the results of analysis of variance 
(ANOVA) tests for an effect of spring stiffness; NE, no exoskeleton. 
Exoskeleton torque increased with spring stiffness (except with the stiffest 
spring, which tended to be engaged later in stance). b, Time course of the 
biological contributions to ankle moment, which decreased with increasing 
spring stiffness. c, Time course of electrical activity in the soleus muscle, an 
ankle plantarflexor, which decreased with increasing spring stiffness. d, Time 
course of ankle joint angle, which triggered passive clutch engagement and 
disengagement. The ratchet was engaged at heel strike, took up slack through 
foot flat, held the spring as it stretched and recoiled through mid- and late 
stance, and disengaged to allow toe clearance during leg swing. The average 
stride period was 1.15 ± 0.08 s (mean ± s.d.). 


11 JUNE 2015 | VOL 522 | NATURE | 213 


©2015 Macmillan Publishers Limited. All rights reserved 


Figure 3 | Human metabolic rate. Spring stiffness affected metabolic rate 
(N = 9; ANOVA with second-order model; P_stiffness = 0.016, P_stiffness² = 0.008). 
Net metabolic rate, with the value for quiet standing subtracted out, was 
7.2 ± 2.6% (mean ± s.e.m.) lower with the 180 N m rad⁻¹ spring (orange bar) 
than during normal walking (dark grey bar; paired two-sided t-test with 
correction for multiple comparisons; P = 0.023). The dashed line is a quadratic 
best fit to mean data from exoskeleton conditions (R² = 0.91, P = 0.029). 
Wearing the exoskeleton with the spring removed (light grey bar, k = 0) did not 
increase energy cost compared with normal walking (paired t-test; P = 0.9). 
Error bars, s.e.m., dominated by inter-participant variability. 


1.47 ± 0.1 W kg⁻¹ in this study. The observed reduction is similar to 
improvements with high-powered devices and equivalent to the effect 
of taking off a 4 kg backpack for an average person. 
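
As a worked check of the reported figures, the sketch below recomputes the relative reduction from the group means given in the text. It assumes the reduction can be approximated from those means; the reported 7.2 ± 2.6% was presumably averaged across participants, so the two need not agree exactly. The function and constant names are illustrative.

```python
# Mean net metabolic rates reported in the text (W per kg of body mass).
NORMAL_WALKING = 2.88   # no exoskeleton
SPRING_180 = 2.67       # exoskeleton with the 180 N m/rad spring
QUIET_STANDING = 1.47   # subtracted from the total rate to obtain net values


def percent_reduction(baseline: float, assisted: float) -> float:
    """Relative reduction of `assisted` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - assisted) / baseline


def net_rate(total: float, standing: float) -> float:
    """Net metabolic rate: total rate minus the rate for quiet standing."""
    return total - standing


reduction = percent_reduction(NORMAL_WALKING, SPRING_180)
print(f"reduction from group means: {reduction:.1f}%")

# Total (gross) rate implied for normal walking: net value plus standing rate.
gross_normal = NORMAL_WALKING + QUIET_STANDING
print(f"implied gross rate for normal walking: {gross_normal:.2f} W/kg")
```

The group means give a reduction of about 7.3%, close to the reported participant-averaged value.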

It is difficult to attribute changes in whole-body metabolic rate to a 
particular change in muscle mechanics, but with this device there is 
an association with reduced muscle forces at the assisted ankle joints. 
Muscles consume energy whenever active, even when producing force 
without performing mechanical work. Simply reducing muscle force can 
therefore save metabolic energy. For all exoskeleton springs, we mea- 
sured reductions in the biological component of ankle moment and the 
activity of major plantarflexor muscles, both indicative of reduced force. 
Reductions occurred primarily during early and mid-stance (0-40% 
stride, Fig. 2b, c) when muscle fascicles are nearly isometric and therefore 
perform little mechanical work. Simulation models estimate that 
plantarflexor muscle energy use primarily occurs during this period 
and accounts for about 27% of the metabolic energy used for walking. 
With the 180 N m rad⁻¹ spring, the biological component of average 
ankle moment was reduced by 14% and mid-stance soleus electrical 
activity was reduced by 22% compared with normal walking. Extrap- 
olating from these values, one might expect about a 4-6% reduction in 
overall metabolic rate, comparable to the observed 7% reduction. 
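
The extrapolation in the preceding paragraph can be reproduced with simple arithmetic. The sketch assumes, as the text implies, that plantarflexor energy use scales in proportion to the measured reduction in moment or muscle activity; the names are illustrative.

```python
PLANTARFLEXOR_SHARE = 0.27    # share of walking energy attributed to plantarflexors
MOMENT_REDUCTION = 0.14       # reduction in biological ankle moment
SOLEUS_EMG_REDUCTION = 0.22   # reduction in mid-stance soleus activity


def expected_whole_body_reduction(share: float, local_reduction: float) -> float:
    """Whole-body reduction (%) if the muscle group's energy use falls proportionally."""
    return 100.0 * share * local_reduction


low = expected_whole_body_reduction(PLANTARFLEXOR_SHARE, MOMENT_REDUCTION)
high = expected_whole_body_reduction(PLANTARFLEXOR_SHARE, SOLEUS_EMG_REDUCTION)
print(f"expected reduction: {low:.1f}% to {high:.1f}%")
```

The two bounds, roughly 3.8% and 5.9%, correspond to the "about 4-6%" quoted in the text.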

Biological contributions to ankle joint work were also partly replaced 
by the exoskeleton, but it is unlikely that these changes were responsible 
for reductions in metabolic rate. The connections between joint work, 
musculotendon work, muscle fascicle work and metabolic rate are 
complex. Much of the mechanical work at the ankle joint during walking 
is the result of elastic stretch and recoil of the Achilles tendon, 
which does not directly consume metabolic energy. Because of tendon 
compliance, using an exoskeleton to reduce cyclic musculotendon work 
can actually preserve or increase the mechanical work performed by 
muscle fascicles: reducing tendon force reduces its stretch, which 
can lead to increased excursion of the muscle itself and more muscle 
work. Even if reduced joint work had been the result of reduced muscle 
fascicle work, under these circumstances such a change would prob- 
ably not have reduced metabolic cost. It has recently been shown that 
for contraction cycles similar to those of the calf muscles during normal 
walking, where muscle fascicles undergo stretch-shorten cycles with 
nearly zero net work, making equal and opposite changes to both negative 
and positive work has no effect on metabolic energy use per unit force. 
Our understanding of the relationship between muscle activity and meta- 
bolic rate remains imperfect, but reduced muscle work does not seem 
to provide a good explanation for reduced metabolic cost in this study. 

Metabolic rate increased back to normal levels when using high- 
stiffness exoskeleton springs, apparently the result of several factors. 
Humans tend to select coordination patterns with similar net ankle 
moments across a range of exoskeleton torques, a trend also observed 
here. With stiff springs, tibialis anterior activity counteracting exo- 
skeleton torque in early and mid-stance appeared to increase, possibly 
reducing changes in total joint moment. Knee muscle activity to pre- 
vent hyperextension during mid- and late stance may also have contri- 
buted to increases in metabolic cost. Unexpectedly, some of the increase 
in metabolic rate appears to be associated with increased plantarflexor 
activity at the end of stance. Furthermore, despite being more active 
during this period, plantarflexor muscles produced lower joint moments. 
These reduced moments probably reflect increased contraction velocity, 
because muscle force drops rapidly as the rate of shortening increases. 
These two observations suggest that exoskeleton support during mid- 
stance led to inefficient, rapid shortening of plantarflexor muscles 
during the usual burst of positive work at the end of the step. Also unex- 
pectedly, it does not appear that the increase in metabolic rate with 
high-stiffness springs is well explained by simple dynamic models of 
walking, which predict changes in centre-of-mass work that were not 
observed here. These and other interpretations are presented in 
expanded form in the Supplementary Discussion and can be explored 
using joint mechanics, muscle activity and centre-of-mass mechanics 
data presented in Extended Data Figs 3-8. 

The complexity of the neuromuscular system can impede useful appli- 
cation of simple ideas from mechanics and robotics to human locomo- 
tion. For example, it is tempting to equate joint work or centre-of-mass 
work with metabolic energy use. However, the benefits derived from 
reduced muscle activity with this unpowered exoskeleton would not 
have been discovered using joint-level power estimates as a guide, since 
these draw attention towards terminal stance and away from early and 
mid-stance when joint power is negative and of low magnitude. The 
increased metabolic rate at higher exoskeleton spring stiffness found 
here also cannot be explained using mechanical power, because human 
contributions decreased or remained suppressed with increasing stiff- 
ness. The complex neuromuscular factors underlying these changes make 
effective integration of assistive devices very challenging and may explain 
why the threshold of reducing the metabolic rate of normal walking, 
with or without additional power input, has taken more than a century 
to cross. Much remains to be learned about human coordination, 
even in this seemingly uncomplicated activity. 

We have demonstrated that net energy input is not a fundamental 
requirement for reducing the metabolic cost of human walking. Reduc- 
ing calf muscle forces—while also fulfilling normal ankle functions and 
minimizing penalties associated with added mass or restricted motions— 
can provide a benefit. Passive clutch-like structures are feasible in nature, 
making the use of this type of device analogous to a change in anatomy 
that improves walking economy. Similar morphological changes might 
augment other lower-limb musculature or locomotion in other animals. 
While evolution, growth and learning have driven efficiency, improve- 
ments are yet possible. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

METHODS 

Participants. Nine healthy adults (N = 9, 2 female, 7 male; age = 23.0 ± 3.7 years; 
mass = 77.4 ± 9.2 kg; height = 1.84 ± 0.10 m; mean ± s.d.) participated in the study. 
One additional participant dropped out before completing the protocol, in part 
owing to hardware malfunctions during training sessions. Sample size was chosen 
on the basis of data from previous studies; no statistical methods were used to 
predetermine sample size. All participants provided written informed consent 
before participation and separately provided written consent to publish de-identified 
photographs (as applicable). The study protocol was approved and overseen by the 
Institutional Review Board of the University of North Carolina at Chapel Hill. 
Exoskeleton hardware. Custom frames were fabricated for each participant using 
modified orthotics methods. A flexible cast was used to create a positive plaster 
mould of the foot, ankle and shank, upon which a thin, selectively reinforced carbon 
fibre frame was formed. Shank and foot segments were removed from the mould 
and connected using an aluminium hinge joint with a plain bearing (Extended Data 
Fig. 2). The custom mechanical clutch (Fig. 1c and Supplementary Methods) 
was then integrated with the frame. Part drawings and CAD files are provided as 
Supplementary Data 1 and 2, a detailed accounting of component mass and comparisons 
with other systems are provided in Extended Data Tables 1 and 2, and a 
demonstration of clutch function can be found in Supplementary Video 2. 

We used five sets of steel coil extension springs with stiffnesses of 5.6, 7.9, 10.5, 
13.3 and 17.2 kN m⁻¹ and masses of 0.059, 0.061, 0.068, 0.092 and 0.098 kg, respectively. 
Spring stiffnesses were determined in experiments where springs were stretched 
to several displacements using a fixture and forces were measured using a load cell. 
Springs were attached to a lever arm on the foot frame with an average radius of 
0.152 m, resulting in average exoskeleton rotational stiffnesses of 130, 180, 240, 310 
and 400 N m rad⁻¹. This spans the range of reported ankle joint quasi-stiffnesses 
for walking. To measure force, a single-axis load cell (LC8125-312-500, Omega 
Engineering) was placed in series with the spring. Exoskeleton joint torque was cal- 
culated as the product of spring force and the lever arm, assuming constant leverage. 
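
The conversion from linear spring stiffness to rotational stiffness can be sketched as follows. For small rotations a spring of stiffness k acting on a lever arm r stretches by r·θ and produces torque k·r²·θ, so the rotational stiffness is k·r². The sketch assumes the constant leverage stated above and uses the average lever arm; the names are illustrative.

```python
LEVER_ARM_M = 0.152  # average lever arm reported in the text (m)
LINEAR_STIFFNESS_N_PER_M = [5600.0, 7900.0, 10500.0, 13300.0, 17200.0]


def rotational_stiffness(k_linear: float, lever_arm: float) -> float:
    """Rotational stiffness (N m per rad) of a linear spring on a lever arm."""
    return k_linear * lever_arm ** 2


def exoskeleton_torque(spring_force: float, lever_arm: float = LEVER_ARM_M) -> float:
    """Joint torque (N m) as spring force times lever arm, assuming constant leverage."""
    return spring_force * lever_arm


rotational = [rotational_stiffness(k, LEVER_ARM_M) for k in LINEAR_STIFFNESS_N_PER_M]
rounded = [int(round(k, -1)) for k in rotational]
print(rounded)  # [130, 180, 240, 310, 400]
```

Rounded to the nearest 10 N m rad⁻¹, the results match the five rotational stiffnesses listed in the text.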

The effective stiffness experienced by participants was lower than that indicated 
by the springs themselves. In a follow-up experiment with a single participant, 
quasi-static loading of the exoskeleton, and additional markers on the exoskeleton 
frame, compliance in the frame and rope led to about an 18% decrease in effective 
stiffness, while compliance at the human-exoskeleton interface led to an additional 
decrease of about 15%. The effective mechanical stiffness of the exoskeleton, when 
clutched, was therefore probably about 33% lower than indicated by the springs 
alone. Such effects probably varied across participants, being dependent both on 
frame construction and on individual human characteristics. 
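
The stiffness losses described above can be combined in a short sketch. It follows the text in treating the two decreases as additive (18% plus 15%, about 33% in total); the function name and the example spring are illustrative.

```python
FRAME_AND_ROPE_LOSS = 0.18   # decrease attributed to frame and rope compliance
INTERFACE_LOSS = 0.15        # additional decrease at the human-exoskeleton interface


def effective_stiffness(nominal_nm_per_rad: float) -> float:
    """Approximate effective stiffness when clutched, summing both losses as in the text."""
    return nominal_nm_per_rad * (1.0 - FRAME_AND_ROPE_LOSS - INTERFACE_LOSS)


# Example: the nominal 180 N m/rad spring.
print(f"{effective_stiffness(180.0):.0f} N m/rad")
```

Under this approximation the nominal 180 N m rad⁻¹ spring would have felt like roughly 120 N m rad⁻¹ to the participant, with the caveat from the text that such effects probably varied between individuals.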

Walking trials. Participants walked on a treadmill at 1.25 m s⁻¹ under seven conditions: 
normal walking without the exoskeleton (No Exoskeleton or NE); walking 
with the complete exoskeleton but no spring connected (No Spring or k = 0); and 
walking with each of the springs attached (exoskeleton spring stiffness k = 130, 
180, 240, 310 and 400 N m rad⁻¹). In previous studies, humans have taken about 
20 min to adapt fully to tethered pneumatic ankle exoskeletons. To allow sufficient 
time for learning, participants completed 21 min of training under each 
condition over three or four walking sessions before data collection. During train- 
ing, participants walked under each condition for 7 min. Mechanical failure of the 
clutch occurred for some conditions during some training sessions, resulting in 
more collection sessions for some participants, but an equal amount of training 
(21 min) with a functioning exoskeleton for all participants and conditions. Data 
were collected during minutes 5-7 of a final 7 min session, or minutes 26-28 of the 
multi-day experiment. The order of presentation of conditions was randomized for 
each participant on the first collection day and then held constant for that par- 
ticipant over the remainder of the experiment. This ensured that each participant's 
training progress was not confounded by ordering effects. Blinding was not prac- 
tical in this protocol. 

Biomechanics and energetics measurements. Body segment motions were mea- 
sured using a reflective marker motion capture system (eight T-Series cameras, 
Vicon). Ground reaction forces were measured using a treadmill instrumented with 
load cells (Bertec). Ankle muscle activity (soleus, medial and lateral gastrocnemius, 
tibialis anterior) was measured using a wired electromyography system (SX230, 
Biometrics). Whole-body oxygen consumption and carbon dioxide production 
were measured using an indirect calorimetry system (Oxycon Mobile, CareFusion). 
Data analysis. Joint angles, moments and powers were calculated from body motions 
and ground reaction forces using inverse kinematics and inverse dynamics analyses 
(Visual 3D, C-Motion). Components of joint moment and power attributed to the 
human (biological component) were calculated by subtracting the exoskeleton 
torque or power, measured using onboard sensors, from the total ankle joint moment 
or power, estimated using inverse dynamics. Centre-of-mass power was calculated 
from ground reaction forces using the individual limbs method. Muscle activity 
was band-pass filtered (20-460 Hz) in hardware and then conditioned by rectifying 
and low-pass filtering with a cutoff frequency of 6 Hz in software. Medial and 
lateral gastrocnemius signals were combined to simplify analysis and interpretation. 
Metabolic rate was estimated from average rates of oxygen consumption 
(V̇O₂) and carbon dioxide production (V̇CO₂) during the collection window using a 
standard formula. The metabolic rate during quiet standing was subtracted from 
gross metabolic rate to obtain the net value attributable to the energetic demands of 
walking. Net metabolic rate values were then normalized to participant 
body mass. 
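
The metabolic rate calculation can be sketched as below. The text refers only to "a standard formula"; this sketch assumes the widely used Brockway coefficients (16.58 J per ml of O₂ and 4.51 J per ml of CO₂, ignoring the nitrogen term), which may differ from what the authors used. The gas-exchange values in the example are hypothetical and are not data from the study.

```python
O2_COEFF_J_PER_ML = 16.58   # assumed Brockway coefficient for oxygen
CO2_COEFF_J_PER_ML = 4.51   # assumed Brockway coefficient for carbon dioxide


def gross_metabolic_power(vo2_ml_per_s: float, vco2_ml_per_s: float) -> float:
    """Gross metabolic power (W) from average gas-exchange rates (ml per s)."""
    return O2_COEFF_J_PER_ML * vo2_ml_per_s + CO2_COEFF_J_PER_ML * vco2_ml_per_s


def net_metabolic_rate(walk_vo2, walk_vco2, stand_vo2, stand_vco2, mass_kg):
    """Net metabolic rate (W per kg): walking minus quiet standing, per body mass."""
    walking = gross_metabolic_power(walk_vo2, walk_vco2)
    standing = gross_metabolic_power(stand_vo2, stand_vco2)
    return (walking - standing) / mass_kg


# Hypothetical participant: 80 kg, illustrative gas-exchange rates in ml per s.
rate = net_metabolic_rate(16.0, 14.0, 5.0, 4.0, mass_kg=80.0)
print(f"net metabolic rate: {rate:.2f} W/kg")
```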

Mechanics data and muscle activity from each condition were broken into strides, 
determined as the period between subsequent heel strikes of a single leg, and an 
average stride for each participant and condition was obtained. These average 
strides were used to calculate values of average moment, mechanical power and 
muscle activity for each participant and condition. Average moment and power 
values were calculated as the time integral of moment and power time series data 
divided by stride period. Positive and negative average joint moments and powers 
were separated out using time integrals of periods of positive or negative moment 
or power, respectively. Average net power was calculated as the time integral of power 
over the whole stride period. Average moment and power values were normalized to 
participant body mass. Average muscle activity was calculated as the time integral 
of muscle activity divided by stride period. Average muscle activity during addi- 
tional periods of interest was calculated as the time integral of muscle activity 
during those periods divided by stride period (for example, early and mid-stance, 
defined as 0-40% stride, and late stance, defined as 40-60% stride). Muscle activity 
was normalized to the maximum value observed during normal walking for each 
muscle and for each participant. For each condition, study-wide average traject- 
ories of lower-limb joint angles, moments and powers were calculated by averaging 
across participants, used for display purposes in Fig. 2 and Extended Data Figs 3-8. 
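
The stride-level averages defined in this paragraph can be sketched in a few functions. This is an illustrative implementation using trapezoidal integration; the function names are hypothetical, and clipping the signal at zero before integrating only approximates the integral over positive or negative periods when a zero crossing falls between samples.

```python
def time_integral(t, y):
    """Trapezoidal time integral of samples y taken at times t."""
    return sum(0.5 * (y[i] + y[i + 1]) * (t[i + 1] - t[i]) for i in range(len(t) - 1))


def stride_average(t, y):
    """Time integral over one stride divided by the stride period."""
    return time_integral(t, y) / (t[-1] - t[0])


def signed_averages(t, y):
    """Average positive and negative contributions, each divided by the stride period."""
    period = t[-1] - t[0]
    positive = time_integral(t, [max(v, 0.0) for v in y]) / period
    negative = time_integral(t, [min(v, 0.0) for v in y]) / period
    return positive, negative


def window_average(t, y, start_frac, end_frac):
    """Integral over part of the stride (fractions of stride), divided by the full period."""
    period = t[-1] - t[0]
    idx = [i for i, ti in enumerate(t) if start_frac <= (ti - t[0]) / period <= end_frac]
    return time_integral([t[i] for i in idx], [y[i] for i in idx]) / period


# Hypothetical samples: a power-like signal and a constant muscle-activity signal.
t_power, power = [0.0, 1.0, 2.0, 3.0], [-1.0, -1.0, 1.0, 1.0]
pos, neg = signed_averages(t_power, power)
t_emg = [i / 10 for i in range(11)]
emg = [1.0] * 11
early_mid_stance = window_average(t_emg, emg, 0.0, 0.4)  # 0-40% stride
print(pos, neg, round(early_mid_stance, 3))
```

Dividing windowed integrals by the full stride period, as the text specifies, means the windowed values sum to the whole-stride average.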
Statistics. For each condition, means and standard errors of net metabolic rate, 
average moment, average mechanical power and average muscle activity outcomes 
were calculated across participants, with standard error indicating inter-participant 
variability. On the basis of the expectation that user performance would be a nonlinear 
function of exoskeleton stiffness, we conducted a mixed-model, three-factor 
ANOVA (random effect: participant; fixed effects: spring stiffness and square of 
spring stiffness) to test for an effect of spring stiffness across exoskeleton conditions 
(significance level α = 0.05; JMP Pro, SAS). For the primary outcome measure, net 
metabolic rate, stiffness had a significant effect. We used paired t-tests with a Šidák-Holm 
correction for multiple comparisons to compare spring conditions with 
each other and with the ‘No Exoskeleton’ condition to identify which exoskeleton 
springs exacted a significant change in metabolic rate. We used a Jarque-Bera two- 
sided goodness-of-fit test to confirm applicability of tests that assume a normal 
distribution. For the primary outcome measure, net metabolic rate, we also used a 
least-squares regression to fit a second-order polynomial (quadratic) function relat- 
ing mean outcome data to exoskeleton spring stiffness. Additional two-factor 
ANOVA analyses (random effect: participant; fixed effect: spring stiffness) were 
performed to test for an effect of spring stiffness across exoskeleton conditions for 
secondary outcomes in joint mechanics, centre-of-mass mechanics and muscle 
activity. These results are compiled in Supplementary Table 1. 
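
The multiple-comparisons correction named above can be illustrated with a minimal sketch of the step-down Šidák-Holm adjustment. It is a generic implementation and not the JMP routine used in the study, and the p-values in the example are hypothetical.

```python
def holm_sidak(p_values):
    """Step-down Šidák-Holm adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared against m - k remaining hypotheses.
        candidate = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = min(1.0, running_max)
    return adjusted


# Hypothetical raw p-values from three paired comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_sidak(raw)
print([round(p, 4) for p in adjusted])
```

A comparison is then declared significant when its adjusted p-value falls below the chosen significance level (0.05 in this study).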
Extended Data Figure 1 | Energy diagrams for human-exoskeleton 
walking. Each diagram includes energy inputs, outputs, storage and transfers 
within the mechanical system, depicted for steady-state walking. In each case, 
all chemical or electrical energy input is eventually output as heat, since the 
mechanical energy of the system is constant on average and no useful work is 
performed on the body or the environment. Energy efficiency, strictly 
defined, is therefore zero in all cases, and so energy effectiveness or energy 
economy is instead characterized in terms of ‘cost of transport’, which is the 
energy used per unit weight per unit distance travelled. a, Energy diagram 
for normal human walking. Muscles consume metabolic energy both to 
produce mechanical work and to absorb it (and to perform a variety of other 
functions, such as activating or producing force), and so metabolic energy flows 
only into the system. Energy loss in muscle manifests as heat. Inside the 
mechanical system, tendons exchange energy with both the muscle and the 
body, while kinetic and gravitational potential energy are exchanged within the 
body segments, all at high mechanical efficiency. Body segment mechanical 
energy is dissipated only in damping in soft tissues, for example during 
collisions, which is small (about 3% of the total metabolic energy input), and 
in friction from slipping of the feet against the ground, deformation of the 
ground or air resistance, all of which are negligible under typical conditions. 
All of these mechanical losses manifest as heat. b, Energy diagram for walking 
with a powered exoskeleton. An additional energy input is provided in the 
form of, for example, electricity. The total energy input (and corresponding 
eventual dissipation) of the system can therefore increase, even if a smaller
portion is borne by the human, resulting in poorer overall energy economy. 
This has been the case with the two powered devices that have reduced the 
metabolic energy cost of human walking””®. In theory, overall energy economy 
could still be improved with a powered device in three ways. First, positive 
mechanical work from muscles could be replaced by work done by a motor with 
higher efficiency. Second, negative mechanical work could be replaced by 
generation done by a motor with higher (than −120%) efficiency, thereby
usefully recapturing energy that would otherwise be dissipated as heat. In fact, 
because muscle expends metabolic energy to absorb mechanical work, it is 
theoretically possible to simultaneously reduce metabolic rate and capture 
electrical energy with zero electrical input*’, although this has yet to be 
demonstrated in practice. Third, the powered device could approximate an 
unpowered device, with negligible amounts of electricity used only to control 
the timing of mechanical elements such as clutches*’. c, Energy diagram for 
walking with an unpowered exoskeleton. No additional energy supply is 
provided; so, unlike the powered case, the only way to decrease metabolic 
energy use is to reduce total system energy dissipation, or, equivalently, to 
improve the energy economy of the system as a whole. Note that the only 
difference from normal human walking, in terms of energy flow, is the addition 
of elements such as springs that store and transfer mechanical energy within 
the system. In this sense, reducing metabolic rate with a passive exoskeleton is 
akin to changing the person’s morphology such that it is more energy-effective 
at locomotion. 
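
The 'cost of transport' defined in this caption (energy used per unit weight per unit distance travelled) can be written as a one-line calculation. The sketch below is illustrative only: the function name, the value of gravitational acceleration and the example numbers are assumptions, not values from the study.

```python
G = 9.81  # assumed gravitational acceleration, m/s^2


def cost_of_transport(energy_j, mass_kg, distance_m, g=G):
    """Dimensionless cost of transport: energy / (weight * distance)."""
    weight_n = mass_kg * g
    return energy_j / (weight_n * distance_m)


# Hypothetical example: 7,848 J used by an 80 kg walker over 100 m
example_cot = cost_of_transport(7848.0, 80.0, 100.0)
```

Because the quantity is normalized by body weight and distance, it allows energy economy to be compared across walkers and conditions even though energy efficiency, strictly defined, is zero in every case.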
Extended Data Figure 2 | Exoskeleton frame design. A rigid carbon fibre
shank frame and foot frame were custom-made for each participant. The shank 
section clamps onto the user’s lower leg just below the knee and connects to 
the foot frame through a rotary joint at the ankle. The foot frame includes a 
lever arm protruding to the rear of the heel, to which the parallel spring is 
connected. The clutch is mounted to the shank frame posterior to the
calf muscles.
Extended Data Figure 3 | Ankle moment contributions. a, Total ankle
moment, measured using a motion capture system. Average total ankle 
moment (b) during the entire stride and (c) during early and mid-stance, 
defined as 0-40% stride, and (d) peak ankle moment. All spring conditions 
increased average total joint moment slightly during early stance, but peak total 
joint moment was maintained across conditions. e, Exoskeleton torque 
contribution, as measured using onboard sensors. Average exoskeleton 
torque (f) during the entire stride and (g) during early and mid-stance, 
defined as 0-40% stride, and (h) peak exoskeleton torque. Average and peak 
exoskeleton torque increased with increasing exoskeleton spring stiffness,
except with the highest stiffness spring. i, Biological contributions to ankle 
moment, calculated as the subtraction of the exoskeleton moment from the 
total moment. Average biological ankle moment (j) during the entire stride 
and (k) during early and mid-stance, defined as 0-40% stride, and (l) peak ankle
moment. Ankle moments arising from muscle activity decreased with 
increasing exoskeleton spring stiffness, but with diminishing returns at high 
spring stiffness. N = 9; bars, mean; error bars, s.e.m.; P values, two-factor 
ANOVA (random effect: participant; fixed effect: spring stiffness). 
Extended Data Figure 4 | Ankle muscle activity. a, Activity in the soleus, a
mono-articular muscle group that acts to plantarflex the ankle. Average
soleus activity over (b) the whole stride, (c) early and mid-stance, defined as
0-40% stride, and (d) late stance, defined as 40-60% stride. Soleus activity
decreased with increasing spring stiffness. e, Activity in the gastrocnemius, a
biarticular muscle group that acts to plantarflex the ankle and flex the knee.
Average gastrocnemius activity over (f) the whole stride, (g) early and
mid-stance, defined as 0-40% stride, and (h) late stance, defined as 40-60%
stride. Gastrocnemius activity was reduced compared with the ‘No
Exoskeleton’ condition during early and mid-stance, but increased with
increasing spring stiffness during late stance. i, Activity in the tibialis anterior, a
mono-articular muscle group that acts to dorsiflex the ankle. Average tibialis
anterior activity over (j) the whole stride, (k) early and mid-stance, defined as
0-40% stride, and (l) late stance, defined as 40-60% stride. Tibialis anterior
activity seemed to increase during early and mid-stance, and was unchanged
during late stance. All values were measured using electromyography and
normalized to maximum activity during normal walking. N = 8; bars, mean;
error bars, s.e.m.; P values, two-factor ANOVA (random effect: participant;
fixed effect: spring stiffness).
Extended Data Figure 5 | Ankle power contributions. a, Mechanical power
of the combined human-exoskeleton system, measured using a motion capture
system, (b) average positive power, defined as positive work divided by
stride time, (c) average negative power, defined as negative work divided by
stride time, and (d) average net power, equivalent to average power, defined as
the sum of positive and negative work divided by stride time. Total positive
ankle joint power decreased with increasing stiffness, while net joint power
increased. e, Exoskeleton power, measured using onboard sensors for torque
and motion capture for joint velocity, (f) average positive exoskeleton
power, (g) average negative exoskeleton power and (h) average net exoskeleton
power. Net exoskeleton power was always negative. i, Biological ankle power,
defined as the subtraction of exoskeleton power from total ankle power,
(j) average positive biological power, (k) average negative biological power
and (l) average net biological power. Net biological power increased with the
exoskeleton compared with normal walking. N = 9; bars, mean; error bars,
s.e.m.; P values, two-factor ANOVA (random effect: participant; fixed effect:
spring stiffness).
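
The averaged power measures defined in this caption can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the power samples are invented, the samples are assumed to be evenly spaced and to span exactly one stride, and work is obtained by trapezoidal integration.

```python
def trapezoid(samples, dt):
    """Trapezoidal integral of evenly spaced samples with time step dt."""
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))


def average_powers(power_w, dt_s):
    """Average positive, negative and net power (W) over one stride."""
    stride_time = dt_s * (len(power_w) - 1)
    positive_work = trapezoid([max(p, 0.0) for p in power_w], dt_s)
    negative_work = trapezoid([min(p, 0.0) for p in power_w], dt_s)
    return {
        "positive": positive_work / stride_time,
        "negative": negative_work / stride_time,
        "net": (positive_work + negative_work) / stride_time,
    }


def biological_power(total_w, exo_w):
    """Biological power: exoskeleton power subtracted from total ankle power."""
    return [t - e for t, e in zip(total_w, exo_w)]


# Hypothetical power samples (W) over a 1 s stride, sampled every 0.25 s
total = [0.0, 10.0, 0.0, -10.0, 0.0]
exo = [0.0, 2.0, 0.0, -6.0, 0.0]
dt = 0.25
total_summary = average_powers(total, dt)
bio_summary = average_powers(biological_power(total, exo), dt)
```

Splitting the signal into its positive and negative parts before integrating keeps the three measures consistent: the net value always equals the sum of the positive and negative values.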
Extended Data Figure 6 | Knee moment. a, Knee moment in time as
measured by motion capture, (b) average absolute knee moment over the entire
stride, (c) average knee moment during early stance, defined as the positive
impulse within approximately 10-30% stride divided by stride period, and
(d) average knee moment during late stance, defined as the negative impulse
within approximately 30-50% stride divided by stride period. Average knee
moment during late stance increased in magnitude with the highest stiffness
springs. Positive values denote knee extension. N = 9; bars, mean; error bars,
s.e.m.; P values, two-factor ANOVA (random effect: participant; fixed effect:
spring stiffness).
Extended Data Figure 7 | Hip, knee and ankle joint mechanics. Joint angles,
moments and powers are presented at the same scale to facilitate comparisons
across joints. a, Hip joint angle, (b) knee joint angle and (c) ankle joint
angle. Joint angle trajectories did not appear to change substantially across
conditions. d, Hip moment, (e) knee moment and (f) biological component of
ankle moment. Hip moment did not appear to change substantially across
conditions, while knee moment and ankle moment showed trends detailed in
Extended Data Figs 6 and 3, respectively. g, Hip joint power, (h) knee joint
power and (i) the biological component of ankle joint power. Hip and knee
power did not appear to change substantially across conditions, while biological
ankle power showed trends detailed in Extended Data Fig. 5. Positive values
denote hip extension, knee extension and ankle plantarflexion with respect to
standing posture. N = 9.
Extended Data Figure 8 | Centre-of-mass mechanics. a, The biological
contribution to centre-of-mass power for each individual limb, defined as the
dot product of ground reaction force with centre-of-mass velocity, both
determined from force plate data, minus the ankle exoskeleton power.
b, Average collision power, defined as the negative work performed during the
first half of stance divided by stride time. c, Average rebound power, defined
as the positive work performed during mid-stance divided by stride time.
d, Average preload power, defined as the negative work performed during
mid-stance divided by stride time. e, Average push-off power, defined as the
positive work performed during late stance divided by stride time. With
increasing spring stiffness, the human contribution to push-off work decreased,
while the human contribution to rebound work increased substantially. N = 9;
thin lines, contralateral limb; bars, mean; error bars, s.e.m.; P values, two-factor
ANOVA (random effect: participant; fixed effect: spring stiffness).
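
The definition in panel a (dot product of ground reaction force with centre-of-mass velocity, minus the ankle exoskeleton power) can be sketched as follows. The function names, the three-component vector layout and all sample values are hypothetical; how the centre-of-mass velocity is derived from force plate data is not covered here.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def biological_com_power(grf_n, com_velocity_ms, exo_power_w):
    """Per-sample limb centre-of-mass power (W) minus exoskeleton power (W)."""
    return [
        dot(force, velocity) - exo
        for force, velocity, exo in zip(grf_n, com_velocity_ms, exo_power_w)
    ]


# Hypothetical samples: force in N, velocity in m/s, exoskeleton power in W
grf = [(0.0, 0.0, 800.0), (100.0, 0.0, 700.0)]
velocity = [(1.0, 0.0, 0.5), (1.0, 0.0, -0.5)]
exo_power = [10.0, -20.0]
bio_power = biological_com_power(grf, velocity, exo_power)
```

The collision, rebound, preload and push-off powers in panels b to e would then follow by integrating the positive or negative portion of this signal over the relevant phase of stance and dividing by stride time.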
Extended Data Table 1 | Passive ankle exoskeleton mass by component

Segment              US Size 8    US Size 13
Frame: Foot          130 g        160 g
Frame: Joints        40 g         40 g
Frame: Shank         105 g        170 g
Frame Mass           275 g        370 g
Average Spring       76 g         76 g
Mechanical Clutch    57 g         57 g
Total Mass           408 g        503 g
Extended Data Table 2 | Comparison of ankle exoskeleton masses

Author                          Mass of exoskeleton (per leg)
Mooney et al.                   2,000 g
Sawicki et al.                  1,210 g*
Malcolm et al.                  760 g*
Passive Elastic (US size 13)    503 g
Passive Elastic (US size 8)     408 g

*Does not include tethered hardware.
Drug-based modulation of endogenous stem cells
promotes functional remyelination in vivo 


Fadi J. Najm, Mayur Madhavan, Anita Zaremba, Elizabeth Shick, Robert T. Karl, Daniel C. Factor, Tyler E. Miller,
Kyle R. Brimacombe, Min Shen, Matthew B. Boxer, Ajit Jadhav, Andrew P. Robinson, Joseph R. Podojil, Stephen D. Miller,
Robert H. Miller & Paul J. Tesar


Multiple sclerosis involves an aberrant autoimmune response and 
progressive failure of remyelination in the central nervous system. 
Prevention of neural degeneration and subsequent disability 
requires remyelination through the generation of new oligoden- 
drocytes, but current treatments exclusively target the immune 
system. Oligodendrocyte progenitor cells are stem cells in the cent- 
ral nervous system and the principal source of myelinating oligo- 
dendrocytes'. These cells are abundant in demyelinated regions of 
patients with multiple sclerosis, yet fail to differentiate, thereby 
representing a cellular target for pharmacological intervention’. 
To discover therapeutic compounds for enhancing myelination 
from endogenous oligodendrocyte progenitor cells, we screened a 
library of bioactive small molecules on mouse pluripotent epiblast 
stem-cell-derived oligodendrocyte progenitor cells*°. Here we 
show seven drugs function at nanomolar doses selectively to 
enhance the generation of mature oligodendrocytes from progen- 
itor cells in vitro. Two drugs, miconazole and clobetasol, are effec- 
tive in promoting precocious myelination in organotypic 
cerebellar slice cultures, and in vivo in early postnatal mouse pups. 
Systemic delivery of each of the two drugs significantly increases 
the number of new oligodendrocytes and enhances remyelination 
in a lysolecithin-induced mouse model of focal demyelination. 
Administering each of the two drugs at the peak of disease in an 
experimental autoimmune encephalomyelitis mouse model of 
chronic progressive multiple sclerosis results in striking reversal 
of disease severity. Immune response assays show that miconazole 
functions directly as a remyelinating drug with no effect on the 
immune system, whereas clobetasol is a potent immunosuppres- 
sant as well as a remyelinating agent. Mechanistic studies show that 
miconazole and clobetasol function in oligodendrocyte progenitor 
cells through mitogen-activated protein kinase and glucocorticoid 
receptor signalling, respectively. Furthermore, both drugs enhance 
the generation of human oligodendrocytes from human oligoden- 
drocyte progenitor cells in vitro. Collectively, our results provide a 
rationale for testing miconazole and clobetasol, or structurally 
modified derivatives, to enhance remyelination in patients. 

As repair of damaged myelin may provide therapeutic benefit in 
multiple sclerosis (MS) and other demyelinating disorders*”, we set 
out to identify drugs that could be re-purposed as remyelinating ther- 
apeutics. We selected the US National Institutes of Health (NIH) 
Clinical Collection I and II libraries comprising 727 drugs with a 
history of safe use in clinical trials, to test for maturation of oligodendrocyte
progenitor cells (OPCs) into myelinating oligodendrocytes.


Using mouse epiblast stem cell (EpiSC)-derived OPCs, we developed 
an in vitro phenotypic screen that accurately quantified differentiation 
into mature oligodendrocytes by high content imaging of myelin pro- 
tein expression (Fig. 1a). 

Two batches (>100 million cells each) of pure OPCs were generated 
from independent mouse pluripotent EpiSC lines of opposite sex 
(Extended Data Fig. 1a). EpiSC-derived OPCs shared virtually all
defining molecular and cellular properties, including gene expression 
profiles with in vivo isolated OPCs, but provided the key advantage of 
being highly scalable (Extended Data Fig. 1b)’. For in vitro screening, 
the seeding density, endpoint assays, and dimethylsulphoxide 
(DMSO) (vehicle) tolerance were optimized in pilot studies to assure 
accurate and reproducible measurement of OPC differentiation in a 
96-well format (Extended Data Fig. 1c). 

For the primary screen, OPCs were treated with vehicle alone 
(0.05% (v/v) DMSO) as a negative control, thyroid hormone (a known 
OPC differentiation inducer) as a positive control’’, or drug dissolved 
in DMSO at a concentration of 5 µM. After 72 h, cells were fixed and
labelled with antibodies to myelin basic protein (MBP) and the length 
and intensity of MBP labelled oligodendrocyte processes measured 
(Fig. 1a). These features were reliable indicators of alteration in cellular 
phenotype, as indicated by consistency and high signal to background 
ratio of positive and vehicle controls across all screening plates 
(Extended Data Fig. 1d-g). We then normalized the experimental data
for the tested drugs against thyroid hormone (set value of 100) on a per
plate basis. On the basis of this analysis, we identified the 22 drugs that 
enhanced oligodendrocyte formation greater than five standard devia- 
tions above DMSO treatment and outperformed thyroid hormone in 
the measured parameters (Fig. 1b). Notably, one of the top 22 drugs 
was benztropine, a muscarinic receptor antagonist recently shown to 
induce OPC differentiation and remyelination*”. 
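
The normalization and hit-selection logic described in this paragraph can be sketched as below. This is a simplified illustration, not the screening script itself: the function names and well values are hypothetical, a single measurement stands in for the two measured parameters (process length and intensity), vehicle is mapped to 0 and thyroid hormone to 100 on each plate, and the sample standard deviation of the vehicle wells is assumed.

```python
from statistics import mean, stdev


def normalize(value, vehicle_mean, positive_mean):
    """Scale a raw value so that vehicle maps to 0 and the positive control to 100."""
    return 100.0 * (value - vehicle_mean) / (positive_mean - vehicle_mean)


def call_hits(drug_values, vehicle_wells, positive_wells, n_sd=5.0):
    """Return {drug: normalized score} for drugs passing both criteria on one plate."""
    v_mean = mean(vehicle_wells)
    v_sd = stdev(vehicle_wells)
    p_mean = mean(positive_wells)
    hits = {}
    for name, value in drug_values.items():
        score = normalize(value, v_mean, p_mean)
        above_vehicle = value > v_mean + n_sd * v_sd  # more than n_sd s.d. above DMSO
        beats_positive = score > 100.0                # outperforms thyroid hormone
        if above_vehicle and beats_positive:
            hits[name] = score
    return hits


# Hypothetical raw values from a single plate (arbitrary units)
vehicle = [9.0, 10.0, 11.0]
thyroid_hormone = [50.0, 50.0]
drugs = {"drug_A": 70.0, "drug_B": 30.0, "drug_C": 12.0}
plate_hits = call_hits(drugs, vehicle, thyroid_hormone)
```

Normalizing on a per-plate basis, as the text describes, means that plate-to-plate drift in staining or imaging affects the controls and the test wells alike and largely cancels out of the score.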

To validate and prioritize the 22 drug hits, the assay was repeated 
using alternative OPCs, reagents, and parameters to eliminate screen- 
specific artefacts (see Methods). Drugs were ranked by their dose- 
dependent ability to induce oligodendrocyte generation from OPCs 
without toxicity (Extended Data Fig. 2a). To demonstrate reproducib- 
ility, an independent laboratory tested selected drug hits using distinct 
equipment, plate format (1,536-well), personnel, and imaging/analysis 
scripts (see Methods). Of the 16 hits tested at the external screening 
site, 14 were validated as potent inducers of oligodendrocyte differ- 
entiation (Extended Data Fig. 2a, b). 

We next tested whether the drug hits could promote the maturation 
of native OPCs in central nervous system (CNS) tissue. Cerebellar 
Figure 1 | A pluripotent stem-cell-based phenotypic screening platform to
identify modulators of OPC differentiation and maturation.
a, Representative images of vehicle- and drug-hit-treated mouse EpiSC-derived
OPCs from the primary screen. Nuclear (DAPI (4′,6-diamidino-2-phenylindole),
blue) and MBP (red) staining along with HCA to identify
oligodendrocyte (oligo.) nuclei (green) and MBP+ processes (yellow). Scale bar,
100 µm. b, Scatter plot of primary screen results displayed as normalized values
of MBP process length and intensity for all 727 drugs with the 22 hits marked in
red. Baseline (vehicle) was set at zero and thyroid hormone (positive control)
was set at 100. c, Montaged images of whole postnatal day 7 mouse cerebellar
slices treated with drug or vehicle for 5 days and stained for MBP (green). Insets
show a representative example of the HCA script used to identify and quantify
MBP+-aligned fibres (light blue). Scale bars, 1 mm for whole slices and 100 µm
for insets. d, Relative quantification of HCA and western blot data from
cerebellar slices treated for 5 days. For HCA screen, n = 1 with 6-12 slices
averaged per group (also see Extended Data Fig. 2a). For western blot, n = 3
independent replicates of 12 slices per group. Values are mean for HCA and
mean ± s.e.m. for western blot. e, Representative western blot of MBP isoforms
and β-actin (loading control) of cerebellar slices treated for 5 days. Full blots are
available in Supplementary Fig. 1. f, Chemical structures of clobetasol and
miconazole. Source data are provided for Fig. 1b, d.


slices were generated from mice at postnatal day 7 (a time that precedes
widespread myelination) and treated ex vivo with drug or
DMSO (vehicle) for 5 days and labelled with anti-MBP antibodies
(Fig. 1c)°’. We screened 11 of the top drugs and used a high content 
analysis (HCA) algorithm developed in house to rank them on the 
basis of their ability to increase the extent of MBP+ aligned fibres in
whole cerebellar slices. The ‘high’ performing group consisted of four
drugs that increased the number of MBP+ aligned fibres by ~150% or
greater (Fig. 1d and Extended Data Fig. 2a). We validated the accuracy 
of our high content screen by semi-quantitative western blotting of 
MBP protein isoforms in independent slice culture experiments 
(Fig. 1d, e)'*”°. 

Analysis of structure-activity relationships revealed that the top hits 
from the primary screen segregated into two specific classes containing 
either a 1,3-diazole with mono-substitution at the 1-position or a
sterane base structure (Extended Data Fig. 3a-d). We selected miconazole
and clobetasol, the top overall performing hits in each of the
imidazole and sterane classes respectively, for further mechanistic and 
functional testing after confirming that both drugs readily crossed the 
blood-brain barrier in mice (Fig. 1f, Extended Data Fig. 2a and 
Supplementary Table 1). Miconazole is a topical antifungal agent func- 
tioning through cytochrome P450 inhibition, and clobetasol is a potent 
topical corticosteroid, but their functions in OPCs were unknown. 

To test whether miconazole or clobetasol enhance remyelination in 
vivo, we used a toxin-induced model whereby focal demyelinated 
lesions are generated in dorsal white matter of the spinal cord of adult 
mice by localized injection of lysolecithin (lysophosphatidylcholine 
(LPC)). In lesioned animals, demyelination is complete within 4 days, 
after which OPCs are recruited into the lesion. Widespread remyelination
does not normally start until 14-21 days post lesion (d.p.l.), which
provides a defined window from days 4 to 14 to test the efficacy of 
drugs to enhance the extent and rate of remyelination’®. Both mico- 
nazole (10 or 40 mg per kg (body weight)) and clobetasol (2 mg per kg) 
treatment induced a marked improvement within the lesions of treated 
mice compared with vehicle-treated controls. At 8 d.p.l. both drugs 
induced a significant increase in the number of newly generated CC1+
oligodendrocytes in the lesion core (Fig. 2a, b). This was coincident 
with extensive MBP staining in the lesions of miconazole- and clobe- 
tasol- but not vehicle-treated animals at both 8 and 12 d.p.l. (Fig. 2a). 
Electron micrographs and tissue sections stained with toluidine blue 
demonstrated that miconazole and clobetasol each induced a striking 
increase in the extent of remyelination (Fig. 2c, d and Extended Data 
Fig. 4a, b). At 12 d.p.l., lesions of vehicle-treated mice consisted
mostly of unmyelinated axons (6% myelinated) while those of clobe- 
tasol- and miconazole-treated mice contained >70% remyelinated 
axons throughout the extent of the lesion (Fig. 2d). Analysis of myelin 
thickness relative to axon diameter (g ratio) at 12 d.p.l. revealed that 
miconazole- and clobetasol-induced myelin was thinner than intact 
myelin, a defining characteristic of remyelination (Fig. 2d). 
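
For readers unfamiliar with the measure, the g ratio is conventionally computed as the axon diameter divided by the diameter of the whole myelinated fibre, so thinner myelin gives a value closer to 1. The sketch below assumes that conventional definition and a circular cross-section; the function name and the example dimensions are hypothetical and are not measurements from this study.

```python
def g_ratio(axon_diameter_um, myelin_thickness_um):
    """Axon diameter divided by total fibre diameter (axon plus myelin on both sides)."""
    fibre_diameter_um = axon_diameter_um + 2.0 * myelin_thickness_um
    return axon_diameter_um / fibre_diameter_um


# Hypothetical 1 µm axon with a thicker and a thinner myelin sheath
thick_sheath = g_ratio(1.0, 0.125)
thin_sheath = g_ratio(1.0, 0.05)
```

Under this definition the thinner sheath yields the higher g ratio, which is the pattern expected for remyelinated axons when compared with intact ones.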

We also evaluated whether miconazole or clobetasol could promote 
precocious myelination during development, in the absence of injury 
or disease. Starting at postnatal day 2 (a time point that
precedes widespread CNS myelination), we treated mice daily for 4 days with drug
or vehicle. In miconazole- and clobetasol-treated mice, we found a
significant increase in the number of CC1+ oligodendrocytes in the
lateral corpus callosum compared with vehicle-treated mice (Extended
Data Fig. 5a). Additionally, we found a significantly larger portion of
the corpus callosum was populated by MBP+ fibre tracts in miconazole-
and clobetasol-treated mice (Extended Data Fig. 5b). This suggests
that clobetasol and miconazole enhance myelination in the
absence of damage or disease. Collectively, the LPC demyelination 
and developmental mouse models demonstrate that miconazole and 
clobetasol each function to induce the differentiation of endogenous 
OPCs in the CNS and promote enhanced myelination. 

To determine whether the drugs were working at a particular stage 
of the OPC differentiation process, we seeded OPCs in differentiation 
conditions and treated them with either miconazole or clobetasol at 
different time points (0, 16, 24, or 48 h), and assayed MBP expression
at 72 h. For both miconazole and clobetasol, the number of MBP+
oligodendrocytes present at 72 h was dependent on drug treatment
within the first 24 h of differentiation (Fig. 3a). In agreement with
these data, treatment of differentiating OPCs with either drug for
different durations (24, 48, 56, and 72 h) induced a progressive,
time-dependent increase in the number of MBP+ oligodendrocytes
(Fig. 3b). These data suggest that both drugs function directly on OPCs 
early in the differentiation process. Additionally, neither drug showed 
a significant impact on astrocyte formation from OPCs in vitro, sug- 
gesting they probably function as direct inducers of oligodendrocyte 
differentiation (Fig. 3c). 
Figure 2 | Miconazole and clobetasol each enhance remyelination in the
LPC lesion mouse model. a, Representative immunohistochemical images of
treated mice showing newly generated oligodendrocytes (CC1, red) and MBP
(green) within the lesion (approximated by white dashed outline) at 8 and
12 d.p.l. Scale bar, 200 µm. b, Quantification of CC1+ oligodendrocytes per
lesion area at 8 d.p.l. Values are mean ± s.e.m.; n = 3 mice per group. Two-tailed
t-test, *P < 0.05. c, Representative electron micrographs showing remyelinated
axons within lesions of drug-treated mice at 12 d.p.l. Scale bar, 2 µm. d, Scatter
plot of g ratios of lesion axons at 12 d.p.l.; n = 100 calculated from two mice per
group compared to wild-type intact axons. Percentage of lesion axons
myelinated is indicated in the legend. Source data are provided for Fig. 2b, d.


Muscarinic receptor antagonists such as benztropine and clemas- 
tine have recently been identified as remyelinating agents*”. Therefore 
we tested whether miconazole or clobetasol function through the mus- 
carinic acetylcholine pathway using functional cellular reporter assays 
of all muscarinic receptor subtypes (M1-M5). Neither miconazole nor 
clobetasol inhibited any of the five muscarinic receptor subtypes 
(Fig. 3d). We then profiled whether clobetasol or miconazole bio- 
chemically inhibited the activity of 414 different kinase isoforms. 
Neither clobetasol nor miconazole inhibited any of the kinases tested, 
suggesting their activity is not based on direct inhibition of protein 
kinases (Supplementary Table 2). 
Figure 3 | Cellular and molecular effects of miconazole and clobetasol on
mouse OPCs. a, Percentage MBP+ oligodendrocytes generated from OPCs
at 72 h with treatments initiated at time points indicated; n = 6 wells per
condition with >6,000 cells scored per well. b, Percentage MBP+
oligodendrocytes generated from OPCs treated simultaneously and analysed at
time points indicated; n = 8 wells per condition with >1,700 cells scored per
well. c, Percentage GFAP+ astrocytes generated from OPCs at 72 h of
treatment; n = 4 wells per condition with >2,900 cells scored per well. d, Heat map
depicting biochemical inhibition of muscarinic receptors M1-M5 displayed as
percentage inhibition with minimum (green) and maximum (red). e, Western
blot of total glucocorticoid receptor and its phosphorylation at Ser220 (p-GR)
in OPCs treated for 1 h. f, Percentage MBP+ oligodendrocytes generated from
OPCs 72 h after treatment; n = 6 wells per condition with >1,400 cells scored
per well. g, Western blot of total ERK1/2 and their phosphorylation at Thr202/
Tyr204 or Thr185/Tyr187 (p-ERK1/2) in cells (OPCs or mouse embryonic
fibroblasts) treated for 1 h. FGF served as a positive control for p-ERK1/2
induction. h, Western blot of total ERK1/2 and p-ERK1/2 in OPCs treated for
1 h in the presence of the indicated pathway inhibitors. All graphs depict
mean ± s.e.m. Full western blots are available in Supplementary Fig. 2. Source
data are provided for Fig. 3a-d, f.


To explore the signalling pathways in OPCs influenced by these 
drugs, we performed genome-wide RNA sequencing and phosphopro- 
teomic analyses on mouse OPCs treated with drug or vehicle 
(Extended Data Fig. 6a-c and Supplementary Table 3). Miconazole
or clobetasol treatment altered OPC transcript expression and phos- 
phoproteins within hours, and influenced expression of genes in sig- 
nalling pathways involved in oligodendrocyte maturation and 
myelination. Clobetasol potently modulated genes downstream of 
multiple nuclear hormone receptors, including glucocorticoid recep- 
tor, which are known to be important regulators of myelin gene 
expression’*"”. Since glucocorticoid receptor signalling is also known 
to enhance Schwann-cell-mediated myelination in the peripheral 
nervous system’*, we tested whether the activity of clobetasol on 
OPCs was mediated by glucocorticoid receptor signalling. Treatment 
of OPCs with clobetasol for 1 h increased the phosphorylation of glu- 
cocorticoid receptor at Ser220, an activating post-translational modi- 
fication (Fig. 3e). RU486, a competitive glucocorticoid receptor 
antagonist, blocked clobetasol-induced glucocorticoid receptor phosphorylation
and oligodendrocyte differentiation (Fig. 3e, f), suggesting
that the activity of clobetasol in OPCs is mediated through the gluco- 
corticoid receptor signalling axis. 

For miconazole, pathway analyses showed that proteins in the mito- 
gen-activated protein (MAP) kinase pathway were most strongly affec- 
ted (Extended Data Fig. 7a, b and Supplementary Table 3). Most 
prominent was the strong and sustained phosphorylation of both extra- 
cellular signal-regulated kinases ERK1 and ERK2 (ERK1/2) at canon- 
ical activation sites, which we validated by western blotting (Fig. 3g). In 
mice, genetic loss of ERK1/2 in the oligodendrocyte lineage results 
in normal numbers of OPCs and oligodendrocytes but widespread 
hypomyelination, while constitutive activation of ERK1/2 results in a 
profound increase in the extent of remyelination after toxin-induced 
demyelinating injury'’”’. In contrast to miconazole, treatment of OPCs 
with clobetasol or benztropine did not induce ERK1/2 phosphorylation 
(Fig. 3g). Miconazole treatment of a non-neural cell type, mouse fibro- 
blasts, also showed no increase of ERK1/2 phosphorylation, indicating 
potential cell-type specificity (Fig. 3g). PD0325901, a small molecule 
inhibitor of ERK’s upstream MAP-kinase kinase (MEK), blocked the 
ability of miconazole to induce ERK1/2 phosphorylation, suggesting 
that miconazole functions through a MEK-dependent mechanism in 
OPCs (Fig. 3h). We also treated mouse OPCs with voriconazole, a 
triazole-containing antifungal cytochrome P450 inhibitor with 80% 
structural similarity to miconazole, which failed to induce changes in 
ERK1/2 phosphorylation (Fig. 3g). This was consistent with the obser- 
vation that voriconazole did not promote the differentiation of OPCs 
into oligodendrocytes (Extended Data Fig. 7c). Taken together, these 
results suggest that the effect of miconazole on OPCs is independent of 
cytochrome P450 inhibition. 

We then assessed whether clobetasol and miconazole treatment 
would enhance the differentiation of human OPCs into oligodendro- 
cytes. We generated human OPCs from human embryonic stem cells 
(hESCs) and human induced pluripotent stem cells (hiPSCs)
(Extended Data Fig. 8a-c)*’”*. We then treated human OPCs with 
DMSO, clobetasol, or miconazole for 21 days followed by staining 
for MBP, imaging, and HCA (Extended Data Fig. 8d-g). Both drugs 
enhanced human OPC differentiation, with miconazole exhibiting the 
most reproducible and potent effects. 

To interpret the potential impact of clobetasol or miconazole as 
therapeutics in immune-mediated MS models, we tested effects on 
immune cell survival and function. We found that only clobetasol, 
as expected from its known corticosteroid properties, altered naive 
T-cell differentiation and both the proliferation and secretion of cytokines
by proteolipid protein (PLP139-151)- or myelin oligodendrocyte
glycoprotein (MOG35-55)-sensitized lymph node cells (Extended Data
Fig. 9a-j). As such, only clobetasol, but not the solely remyelinating
drugs miconazole or benztropine, showed efficacy in reducing disease
severity in the immune-driven relapsing-remitting PLP139-151 experimental
autoimmune encephalomyelitis (EAE) model (Fig. 4a). The
positive effect of clobetasol in this model resulted from its immuno- 
suppressive effects as evidenced by the severe reduction of T cells 
within the spleen (Fig. 4b). 

We also used a second EAE mouse model, MOG35-55-induced, in
which the immune response was relatively controlled and disease 
pathology recapitulated chronic progressive demyelination. We used 
a therapeutic, rather than prophylactic, treatment regimen to evaluate 
whether drugs could reverse, rather than prevent, disease. Miconazole- 
and clobetasol-treated animals all exhibited a marked improvement in 
function, with nearly all animals regaining use of one or both hind 
limbs (Fig. 4c, d). In contrast, vehicle-treated mice exhibited chronic 
hindlimb paralysis over the treatment period. Benztropine treatment 
also resulted in functional improvement, but to a lesser extent than 
miconazole and clobetasol (Fig. 4c, d). Overt functional recovery of 
miconazole- and clobetasol-treated mice correlated with histological 
improvements in the spinal cord. Specifically, drug-treated mice
showed restoration of MBP expression and a reduction in the extent
of demyelination in the spinal cord, whereas vehicle-treated mice
showed sustained areas of white matter disruption (Extended Data
Fig. 10a-e).

Figure 4 | Therapeutic efficacy of miconazole and clobetasol in mouse
models of MS. a, Scoring of disease severity in relapsing-remitting PLP139-151-
induced EAE mice treated beginning on day 13 (black arrow) and ending on
day 29; n = 10 mice per group. Graph depicts mean daily disease score ± s.e.m.
b, Flow-cytometric-based quantification of spleen cell numbers at day 29 from
the PLP139-151 EAE cohort in a. Values are mean ± s.e.m.; n = 4 or 5 mice per
group. c, Scoring of disease severity in chronic progressive MOG35-55-induced
EAE mice treated daily for 10 days beginning at the peak of disease on day 15
(black arrow); n = 12-16 mice per group. Graph depicts mean daily disease
score ± s.e.m. d, Mean improvement in disease score per animal (peak score
minus ending score) of MOG35-55 EAE cohort in c. Also shown are external
validation results in MOG35-55 EAE from an independent contract laboratory.
n = 12 mice per group. For all EAE experiments, drugs were dosed daily by
intraperitoneal injection: clobetasol (2 mg/kg), miconazole (10 mg/kg),
benztropine (10 mg/kg), or FTY720 (1 mg/kg). All EAE disease scoring was as
follows: 0, no abnormality; 1, limp tail; 2, limp tail and hind limb weakness; 3,
hind limb paralysis; 4, hind limb paralysis and forelimb weakness; and 5,
moribund. Two-tailed t-test, *P < 0.05 and **P < 0.01 for drug-treated groups
compared with their respective vehicle-treated group. Source data are provided
for Fig. 4a-d.

Although the immunosuppressive effect of clobetasol makes it chal- 
lenging to evaluate its remyelinating potential in EAE directly, its 
consistent and robust induction of OPC differentiation in vitro, and 
enhancement of remyelination in non-immune-driven in vivo assays, 
suggests that it serves a role in both immunomodulation and pro- 
motion of myelination. In contrast, miconazole did not modulate 
immune cell function and our data indicate that it acts as a direct 
remyelinating agent. Given the potential of miconazole as a remyeli- 
nating therapeutic, we contracted a separate laboratory to provide 
independent validation of its efficacy in the MOG35-55-induced EAE
preclinical model. The laboratory independently validated the preclinical
efficacy of miconazole in MOG35-55-induced EAE to reduce
disease severity in treated mice (Fig. 4d). 

Since the approval in 1993 of interferon (IFN)-β-1b for the treatment
of MS, therapeutic development has centred on the generation of
additional immunomodulatory agents. Despite the effectiveness of 
many of these drugs to modulate CNS inflammation in patients with 
MS, none of them prevent chronic progressive disease and disability,
largely because of their inability to stop or reverse the failure of remyelination
in the CNS. We developed an advanced high-throughput
screening platform to discover effective remyelinating therapeutics.
This pluripotent stem-cell-based system provides unprecedented scal- 
ability, purity, and genotypic flexibility to screen for compounds 
that enhance OPC differentiation and myelination. Using this plat- 
form we identified two drugs approved by the US Food and Drug 
Administration, miconazole and clobetasol, with newly discovered 
functions to modulate OPC differentiation directly, enhance remyeli- 
nation, and significantly reduce disease severity in mouse models of 
MS. Since miconazole and clobetasol are currently only approved for 
topical administration in humans, significant optimization of dosing, 
delivery, and potentially chemical structure will be required to enhance 
the on-target pharmacology in OPCs while diminishing any potential 
off-target side effects. However, the ability of miconazole and clobeta- 
sol to cross the blood-brain barrier raises the exciting possibility that 
these drugs, or modified derivatives, could advance into clinical trials 
for the currently untreatable chronic progressive phase of MS. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper.

METHODS


No statistical methods were used to predetermine sample size. 

Mouse OPC preparation. OPCs used in this study were generated from two 
separate EpiSC lines, EpiSC9 (female) and 12901 (male), using in vitro differentiation
protocols and culture conditions described previously. Cultures were
regularly tested and shown to be mycoplasma free. To ensure uniformity through- 
out all in vitro screening experiments, EpiSC-derived OPCs were sorted to purity 
by fluorescence-activated cell sorting at passage five with conjugated CD140a-APC
(eBioscience, 17-1401; 1:80) and NG2-AF488 (Millipore, AB5320A4; 1:100) anti- 
bodies. Sorted batches of OPCs were expanded and frozen down in aliquots. OPCs 
were thawed into growth conditions for one passage before use in screening assays. 

In vitro phenotypic screening of OPCs. EpiSC-derived OPCs were seeded onto
poly-D-lysine 96-well Viewplate or CellCarrier plates (PerkinElmer) coated with
laminin (Sigma, L2020; 10 µg ml⁻¹) using electronic multichannel pipettors. For
the primary screen, 30,000 cells were seeded per well in screening medium
(DMEM/F12 supplemented with N2 (R&D Systems), B-27 (Life Technologies),
neurotrophin 3 (R&D Systems; 10 ng ml⁻¹), cAMP (Sigma; 50 µM), IGF-1 (R&D
Systems; 100 ng ml⁻¹), noggin (R&D Systems; 100 ng ml⁻¹)) and allowed to
attach for 2 h before addition of drug. NIH Clinical Collection I and II
(http://www.nihclinicalcollection.com) drugs were added to assay plates with 0.1 µl pin
replicators (Molecular Devices, Genetix; X5051), resulting in a final primary
screening concentration of 5 µM. Thyroid-hormone-positive controls and
DMSO vehicle controls were included in each assay plate. Cells were incubated
under standard conditions (37 °C, 5% CO2) for 3 days and fixed with 4%
paraformaldehyde (PFA) in phosphate buffered saline (PBS). Fixed plates were
permeabilized with 0.1% Triton X-100 and blocked with 10% donkey serum (v/v) in
PBS for 2 h. Cells were labelled with MBP antibodies (Abcam, ab7349; 1:100) for
1 h at room temperature (~22 °C) followed by detection with Alexa Fluor-
conjugated secondary antibodies (1:500) for 45 min. Nuclei were visualized by
DAPI staining (Sigma; 1 µg ml⁻¹). All plates for the primary screen were processed
and analysed simultaneously to eliminate variability. Donepezil was identified
in the primary screen; however, the drug was not available at the time of dose-
response testing and was excluded from further testing.

Dose-response testing of drug hits followed the same procedure with the fol- 
lowing modifications to eliminate any artefacts in the primary screen: indepen- 
dently sourced drugs; a distinct batch of EpiSC-derived OPCs from a mouse of 
opposite sex; multi-dose testing; cytotoxicity analysis; an alternative marker of 
mature oligodendrocytes proteolipid protein 1 (PLP1, antibody clone AA3 pro- 
vided by B. Trapp; 1:5,000); and an alternative high content assay endpoint para- 
meter (percentage of oligodendrocytes differentiated instead of process intensity 
and length parameters). All drugs were tested in quadruplicate at seven different 
doses (ranging from 333 nM to 6.7 µM) and classified into tiers on the basis of their
half-maximum effective concentration (EC50) to induce OPC maturation, and
their toxicity (concentration at which 50% of the cells were lost). Tier A drugs 
(n = 3) consisted of nanomolar dose effectors with little to no detectable toxicity at 
doses tested. Tier B drugs (n = 4) showed nanomolar effects but demonstrated 
toxicity at high doses. Tier C and D drugs required high doses to see an effect, 
demonstrated toxicity at low doses, or failed to show a dose-dependent response. 
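
The text does not say how EC50 values were computed from the seven-dose series. As a rough illustration only, the sketch below estimates the dose giving half of the maximum observed response by interpolating on a log10 dose axis. The function name, the interpolation approach, and the response values are assumptions made for this example; a fitted four-parameter logistic curve is the more usual choice in practice.

```python
import math

def ec50_by_interpolation(doses_nm, responses):
    """Estimate the dose (nM) giving half of the maximum observed response.

    Assumes responses are already vehicle-subtracted (baseline of 0) and
    interpolates linearly on a log10(dose) axis between the two doses that
    bracket the half-maximal response. Returns None if nothing brackets it.
    """
    pairs = sorted(zip(doses_nm, responses))
    half_max = max(r for _, r in pairs) / 2.0
    for (d_lo, r_lo), (d_hi, r_hi) in zip(pairs, pairs[1:]):
        if r_lo < half_max <= r_hi:
            frac = (half_max - r_lo) / (r_hi - r_lo)
            log_ec50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_ec50
    return None

# Doses match the seven-point series in the text; responses are hypothetical.
doses_nm = [333, 500, 666, 1700, 3300, 5000, 6700]
responses = [5.0, 10.0, 20.0, 40.0, 60.0, 78.0, 80.0]
ec50_nm = ec50_by_interpolation(doses_nm, responses)
```

With these made-up responses the half-maximal value (40) falls exactly on the 1.7 µM dose, so the estimate is about 1,700 nM.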

HCA of in vitro screen. For the 5 µM in vitro screen, stained plates were imaged
on the Opera confocal imaging system (PerkinElmer) and a set of 24 ×10 fields
were collected from each well, resulting in an average of 10,000 cells being scored
per well. For the dose-response (6.7 µM, 5 µM, 3.3 µM, 1.7 µM, 666 nM, 500 nM,
and 333 nM) in vitro assays, plates were imaged on the Operetta High Content
Imaging and Analysis system (PerkinElmer) and a set of 14 ×20 fields captured
from each well resulting in an average of 3,300 cells being scored per well. Analysis
(PerkinElmer Acapella, Harmony, and Columbus software) began by identifying
intact nuclei stained by DAPI; that is, those traced nuclei that were larger than
50 µm² in surface area and possessed intensity levels that were typical and less than
the threshold brightness of pyknotic cells. Each traced nucleus region was then
expanded by 50% and cross-referenced with the mature myelin protein (MBP or
PLP1) stain to identify oligodendrocyte nuclei, and from this the percentage of
oligodendrocytes was calculated. Processes emanating from oligodendrocyte
nuclei were identified using the CSIRO2 analysis module within a custom
Acapella script. Maximum mean process length (denoted ‘process length’) and
mean process intensity (denoted ‘process intensity’) were generated on a per well
basis. For the 5 µM in vitro screen, values were calculated and normalized to 100
for thyroid hormone (positive control)-treated wells and to 0 for DMSO (vehicle)-
treated wells, on a per plate basis.
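
The per-plate normalization described above can be sketched as a linear rescaling. This is a minimal illustration, assuming the mean of the control wells on each plate defines the 0 and 100 anchors; the text does not say whether means or medians were used. The function name, well identifiers, and values are hypothetical.

```python
from statistics import mean

def normalize_plate(wells, positive_ids, vehicle_ids):
    """Rescale raw per-well values for one plate.

    The mean of the vehicle (DMSO) wells maps to 0 and the mean of the
    positive-control (thyroid hormone) wells maps to 100. Using the mean
    of the control wells is an assumption of this sketch.
    """
    vehicle_mean = mean(wells[w] for w in vehicle_ids)
    positive_mean = mean(wells[w] for w in positive_ids)
    span = positive_mean - vehicle_mean
    if span == 0:
        raise ValueError("positive and vehicle controls have the same mean")
    return {w: 100.0 * (v - vehicle_mean) / span for w, v in wells.items()}

# Hypothetical raw process-length values for five wells on one plate.
plate = {"A1": 2.0, "A2": 4.0, "B1": 12.0, "B2": 14.0, "C1": 8.0}
scores = normalize_plate(plate, positive_ids=["B1", "B2"], vehicle_ids=["A1", "A2"])
```

Here the vehicle mean is 3 and the positive-control mean is 13, so the drug well C1 (raw value 8) scores 50. Because each plate is rescaled against its own controls, scores remain comparable across plates even if absolute staining intensity drifts.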

Phenotypic validation testing of OPCs. Briefly, OPCs were grown and expanded 
in laminin-coated flasks before harvesting for plating. Cells were dispensed in 
screening media (see above for details) using a Multidrop Combi dispenser 
(Thermo Fisher) into laminin/poly-L-ornithine-coated sterile, 1,536-well, black
clear-bottom tissue culture plates (Brooks Automation), to a final density of
2,000 cells per well. Plates were sealed with gasketed stainless steel lids with holes 
for gas exchange (Wako USA). Following cell attachment, library compounds 
were transferred by pintool (Wako USA) using 10 nl slotted pins. Library com- 
pounds were serially diluted in DMSO, and were added to plates to yield final 
concentrations of 0 (DMSO only), 4, and 20 µM compound. After incubation for
72 h at 37 °C, cells were fixed, washed, and stained similarly to the 96-well OPC assay
protocol, although all aspiration steps were performed using a Biotek EL406 
Microplate Washer Dispenser (Biotek) equipped with a 1,536-well aspiration 
manifold. Dispense steps were performed with both peristaltic pump cassettes 
(for gentle reagent additions) and syringe pump manifolds (for faster bulk dispenses).
Cells were stained with DAPI (Sigma; 1 µg ml⁻¹) and MBP antibody
(Abcam, ab7349; 1:100). Plates were then imaged using an InCell 2000 Analyzer 
High Content Imager (GE Healthcare Bio-Sciences). Well images were analysed 
using InCell Analyzer Workstation software, and the MBP signal was quantified 
with a process detection algorithm, using total process skeleton length to qualify 
activity. 

Ex vivo cerebellar slice cultures. Whole cerebellum was collected from C57BL/6 
mice at postnatal day 7 and embedded in agarose. Sagittal slices were cut on a 
microtome (Leica) at 300 µm. Slices were cultured in a DMEM-Basal Medium
Eagle’s base with 15% heat-inactivated horse serum, modified N2, and PDGF-AA. 
After 1 day in culture, slices were treated daily for 5 days with test drugs or vehicle 
(DMSO). Drugs tested were clobetasol (5 µM), hydroxyzine (5 µM), clotrimazole
(2 µM), miconazole (1 µM), ketoconazole (1 µM), vesamicol (5 µM), propafenone
(2 µM), dicyclomine (5 µM), benztropine (2 µM), haloperidol (5 µM), and
medroxyprogesterone (5 µM). The identity of the drugs was blinded to the experimenter.
Slices were then lysed for western blot or fixed in 4% PFA and processed
for HCA as detailed below. 

Immunohistochemistry. Immunohistochemistry was performed as previously 
described. In short, tissue sections or whole slices were washed three times in
PBS, blocked in PBS containing Triton X-100 (0.1%) and normal donkey serum 
(NDS, 2% for sections and 10% for cerebellar slices) and incubated with primary 
antibody overnight. For MBP immunohistochemistry, the primary antibody solu- 
tion consisted of 2% NDS, 2% bovine serum albumin, and 0.1% saponin. For all 
other antibodies, the primary antibody solution consisted of 2% NDS and 0.1% 
Triton X-100. Primary antibodies used included rat anti-MBP (Abcam, ab7349; 
1:100), mouse anti-APC CC1 clone (Millipore, MABC200; 1:500), and rabbit anti-IBA1
(Wako Chemicals, 019-19741; 1:1,000). The tissue was then washed in PBS
and incubated in secondary antibodies for 2 h. Secondary detection was performed 
with Alexa Fluor-conjugated secondary antibodies (1:500) for 1 h. Luxol fast blue 
staining was performed as previously described.

High content screen of cerebellar slices. MBP-stained cerebellar slices were 
analysed by confocal imaging on an Operetta system using PhenoLOGIC
machine-learning technology within Harmony software. The software was trained 
to identify elongated fibres more characteristic of axonal ensheathment and to 
exclude regions of small fibres or diffuse background fluorescence on the basis of 
texture features. MBP-positive surface area was collected and normalized to the 
total surface area for the group of slices treated with each drug. A minimum of six 
slices were treated per drug, which included an equal distribution of medial and 
lateral slices. 

Western blotting of cerebellar slices. Cerebellar slices (each biological replicate 
using 12 slices per condition; six each from two separate animals) were collected in 
PBS and centrifuged. The PBS was aspirated and the pellet resuspended in 100 µl
lysis buffer (20 mM Tris, 137 mM NaCl, 5.0 mM EDTA pH 8.0, 10% glycerol, 1%
NP40, pH to 8.0 with HCl), incubated on ice for 20 min, centrifuged, and the 
supernatant collected. Protein concentration was determined by a Pierce BCA 
protein assay kit (Thermo Fisher). Equal amounts of protein were applied to 
NuPAGE 12% Bis-Tris gels (Life Technologies), and electrophoretically transferred
onto a PVDF membrane (Life Technologies). The membranes were incubated
with rabbit anti-MBP (Millipore, AB980; 1:500) and subsequently probed
with horseradish peroxidase (HRP)-conjugated goat anti-rabbit (1:5,000) or incubated
with HRP-conjugated mouse anti-β-actin (Sigma, A3854; 1:10,000) to
ensure even loading of samples. Enhanced chemiluminescence was performed 
with a West Pico kit (Thermo Fisher) and relative optical density was measured 
using ImageJ (NIH).

Chemoinformatics. Structure-activity searches of azoles and steranes were performed
with the Canvas program (Schrödinger Software, release 2014-1: Canvas,
version 1.9). Tanimoto similarity between voriconazole and miconazole was cal- 
culated by ROCS (OpenEye Scientific Software). 

Pharmacokinetics. C57BL/6 adult female mice were dosed intraperitoneally with 
miconazole (10 mg/kg or 40 mg/kg) or clobetasol (10 mg/kg). After 1 or 6 h, 100 µl
of plasma was collected then each animal was perfused with PBS. Brains were 
collected, weighed, and rinsed with PBS. Water (0.5 ml) was added to the brain
samples, which were then homogenized. Plasma and brain samples were each
diluted fivefold with blank rat plasma. Three hundred microlitres of internal 
standard solution was added to the samples, vortexed, and centrifuged. Five 
microlitres of each sample was injected into an API-4000Qtrap mass spectrometer 
and quantified (Climax Labs). 

Focal demyelination and drug treatment. Focal demyelination in the spinal cord 
was induced by the injection of 1% LPC solution. Ten- to 12-week-old C57BL/6 
female mice were anaesthetized using isoflurane and a T10 laminectomy was 
performed. One microlitre of 1% LPC was infused into the dorsal column at a 
rate of 15 ml h⁻¹. The animals were euthanized either at day 8 or day 12 after the
laminectomy (n = 6-9 per group). Animals that were euthanized at day 8 received 
vehicle or drug daily by intraperitoneal injection between days 3 and 7. Animals 
used in the day 12 experiments received vehicle or drug daily by intraperitoneal 
injection between days 4 and 11. Drugs were dissolved in DMSO and then diluted 
with sterile saline for injection. Mice were deeply anaesthetized using ketamine/ 
xylazine rodent cocktail and then euthanized by transcardial perfusion with 4% 
PFA for histological analysis or 4% PFA, 2% glutaraldehyde, and 0.1 M sodium
cacodylate for electron microscopy. PFA fixed tissue was equilibrated in 30% 
sucrose, embedded in OCT, and cryosectioned at 20 µm thickness and processed
for CC1 and MBP immunohistochemistry. ImageJ was used to measure area of the
lesion and CC1+ cells within the lesion were scored manually. For CC1 scoring,
sections were taken from the centre of each lesion to control for lesion variability. 

Electron microscopy. Samples were processed as previously described. In
short, samples were osmicated, stained en bloc with uranyl acetate, and embedded 
in EMbed 812, an Epon-812 substitute (EMS). Sections (1 µm) were cut and
stained with toluidine blue and visualized on a light microscope (Leica 
DM5500B). Additional thin sections were cut, carbon-coated and imaged either 
on a JEOL JEM-1200-EX electron microscope or a T12 electron microscope (FEI).

Developmental myelination. Mouse pups of strain CD1 were administered 2 mg/kg
clobetasol, 10 mg/kg miconazole, or vehicle (DMSO in saline) by daily intraperitoneal
injections from postnatal day 2 to 6. Some clobetasol-treated animals
exhibited sickness on the basis of this treatment, with low body weight, and some 
animals of the cohort died before end of treatment. On postnatal day 6 the pups 
were anaesthetized using ketamine and xylazine and euthanized by transcardial 
perfusion with 4% PFA. Tissue was fixed overnight in 4% PFA, equilibrated in 30% 
sucrose, and embedded in OCT. Sections (20 µm) were cut and processed for CC1
and MBP immunohistochemistry. ImageJ was then used to count and measure 
area of the corpus callosum as well as measure the extent of the corpus callosum 
length covered by MBP processes. Eight coronal sections containing corpus 
callosum rostral to the hippocampus from at least three animals per group were 
used for these analyses. To quantitate extent of MBP in the corpus callosum, a 
line was drawn through the centre of the corpus callosum from the lateral tip to 
the dorsal most extent of MBP expression in the corpus callosum. The length of 
this line was measured and then the dorsal-most point of the line was extended to 
the dorsal tip of the corpus callosum and measured to yield the length of the lateral 
callosum. The two numbers were divided to get the MBP/corpus callosum pro- 
portion. A two-tailed t-test was used to compare drug- with vehicle-treated groups. 

Muscarinic receptor antagonism. Miconazole, clobetasol, and benztropine (all at
1 µM in DMSO) were sent to Select Screen (Life Technologies) with identities
coded. GeneBLAzer or Tango assays were performed to determine level of acet- 
ylcholine muscarinic receptor M1, M3, M5 (GeneBLAzer), M2 and M5 (Tango) 
antagonism. 

Kinase profiling. LanthaScreen, Z'-LYTE, and Adapta kinase assays were per- 
formed by Select Screen (Life Technologies). LanthaScreen Eu kinase assays were 
performed in Greiner low-volume 384-well plates. Assay buffer consisted of 
50 mM HEPES pH 7.5, 0.01% BRIJ-35, 10 mM MgCl2, 1 mM EGTA. Each well
consisted of this mixture: 4.0 µl of 4 µM test drug in assay buffer, 8 µl of 2× kinase/
Eu antibody mixture, and 4 µl of 4× Alexa Fluor 647 tracer. Plates were incubated
for 60 min at room temperature (~22 °C), then Alexa Fluor 647 emission
(665 nm) and Europium emission (615 nm) read on a fluorescent plate reader. 
Data were analysed by generating the emission ratio (665 nm/615 nm) for each test 
point and normalizing 0% to control wells with no known inhibitor and 100% to 
control wells with highest concentration of known inhibitor. 
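
The ratio-and-normalize step just described reduces to a two-point linear rescaling. The sketch below is illustrative only: the function names and the raw emission counts are hypothetical, and it assumes each control is summarized by a single ratio value.

```python
def emission_ratio(acceptor_665, donor_615):
    """Ratio of Alexa Fluor 647 emission (665 nm) to europium emission (615 nm)."""
    if donor_615 == 0:
        raise ValueError("donor emission must be non-zero")
    return acceptor_665 / donor_615

def percent_of_control(ratio, ratio_no_inhibitor, ratio_max_inhibitor):
    """Map a test ratio onto a 0-100 scale.

    0 corresponds to control wells with no known inhibitor and 100 to
    control wells with the highest concentration of known inhibitor.
    """
    span = ratio_max_inhibitor - ratio_no_inhibitor
    if span == 0:
        raise ValueError("control ratios are identical")
    return 100.0 * (ratio - ratio_no_inhibitor) / span

# Hypothetical raw counts (665 nm, 615 nm) for the two controls and one test well.
no_inhibitor = emission_ratio(800.0, 1000.0)
max_inhibitor = emission_ratio(200.0, 1000.0)
test_well = emission_ratio(500.0, 1000.0)
displacement = percent_of_control(test_well, no_inhibitor, max_inhibitor)
```

The same rescaling applies to the other two assay formats, with the emission wavelengths and the 0% and 100% control definitions swapped for the ones given in their respective paragraphs.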

Z'-LYTE assays were performed in Corning, low volume 384-well plates. Assay 
buffer consisted of 50 mM HEPES pH 7.5, 0.01% BRIJ-35, 10 mM MgCl2, 1 mM
EGTA. Each well consisted of this mixture: 2.5 µl 4× of 4 µM drug in assay buffer,
5 µl 2× peptide/kinase mixture, 2.5 µl 4× ATP solution. Plates were then incubated
at room temperature (~22 °C) for 60 min. Then 5 µl of a development
reagent that contained a protease that selectively digested the non-phosphorylated
peptide was added and the plates incubated for 60 min. Coumarin emission
(445 nm) and fluorescein emission (520 nm) were read on a fluorescent plate
reader. Data were analysed by normalizing out background fluorescence then
generating the emission ratio (445 nm/520 nm) for each test point. Data were
further normalized to 0% in control wells with no ATP and 100% in control wells
with synthetically phosphorylated peptide of the same sequence. 

Adapta assays were performed in Corning, low volume 384-well plates. Assay 
buffer consisted of 30 mM HEPES. Each well consisted of this mixture: 2.5 µl 4× of
4 µM drug in assay buffer, 2.5 µl 4× ATP solution, 5 µl 2× substrate/kinase
mixture. Plates were then incubated at room temperature (~22 °C) for 60 min.
Then 5 µl of a development reagent that contained europium-anti-ADP antibody
and ADP tracer was added and the plate incubated for 60 min. Alexa Fluor 647
emission (665 nm) and europium emission (615 nm) were read on a fluorescent
plate reader. Data were analysed by generating the emission ratio (665 nm/615 nm)
for each test point and normalizing 0% to control wells with no ATP
in the kinase reaction and 100% to control wells with ADP. 

Raw data from all kinase assays can be found in Supplementary Table 2.

HCA of astrocyte induction. For the astrocyte experiments in Fig. 3c, the experimental
setup was identical to the PLP1-based primary validation screen except
plates were stained for GFAP (DAKO, Z0334; 1:5,000). BMP4 (R&D Systems; 
50 ng ml⁻¹) and LIF (Millipore; 10° U ml⁻¹) were used as the positive control for
astrocyte induction. Assay plates were imaged on the Operetta High Content 
Imaging and Analysis system and a set of 14 ×20 fields captured. Columbus
Data Management and Analysis System software (PerkinElmer) was used to 
quantify the percentage of GFAP+ astrocytes in each well using a method similar
to that developed for oligodendrocytes. 

Global phosphoproteomics. Quantitative global phosphorylation studies were 
performed on OPCs across two different time points (1 and 5h after treatment) 
with miconazole, clobetasol, or DMSO treatment using a label-free ultra-high- 
performance liquid chromatography-tandem mass spectrometry (LC-MS/MS) 
workflow without fractionation. Briefly, for each sample 30 million cells were lysed 
with 2% SDS solution with protease and phosphatase inhibitor (Thermo Fisher), 
and detergent was removed on 200 µl of the cell lysate using the FASP cleaning
procedure. Each sample was digested by a two-step Lys-C/trypsin proteolytic
cleavage and subjected to phospho-enrichment using commercially available
TiO2 enrichment spin tips (Thermo Fisher). LC-MS/MS analysis used a UPLC
system (NanoAcquity, Waters) that was interfaced to an Orbitrap ProVelos Elite 
MS system (Thermo Fisher). Fold-change calculations were determined from pep- 
tide intensities for each drug versus DMSO at each time point. Phosphopeptides 
with greater than twofold change were imported into Ingenuity Pathway Analysis 
to elucidate signalling pathways perturbed with drug treatment. 
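
The fold-change filter described above might look like the following. This is a sketch under stated assumptions: "greater than twofold change" is read as applying in both directions (ratio above 2 or below 0.5), peptides without a usable DMSO intensity are skipped, and the peptide labels and intensities are hypothetical.

```python
def changed_phosphopeptides(drug, dmso, threshold=2.0):
    """Return {peptide: fold change} for peptides changed more than `threshold`-fold.

    Fold change is drug intensity divided by DMSO intensity at one time
    point. Treating decreases (ratio < 1/threshold) as changes, and skipping
    peptides with missing or non-positive intensities, are assumptions of
    this sketch rather than details given in the text.
    """
    hits = {}
    for peptide, drug_intensity in drug.items():
        reference = dmso.get(peptide)
        if reference is None or reference <= 0 or drug_intensity <= 0:
            continue
        fold = drug_intensity / reference
        if fold > threshold or fold < 1.0 / threshold:
            hits[peptide] = fold
    return hits

# Hypothetical label-free intensities at a single time point.
drug_1h = {"pepA": 300.0, "pepB": 110.0, "pepC": 20.0, "pepD": 50.0}
dmso_1h = {"pepA": 100.0, "pepB": 100.0, "pepC": 100.0}
hits = changed_phosphopeptides(drug_1h, dmso_1h)
```

In this example `pepA` (threefold up) and `pepC` (fivefold down) pass the filter, `pepB` does not, and `pepD` is skipped because it has no DMSO reference.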

RNA sequencing and analysis. Cells were lysed directly in 1 ml TRIzol
(Invitrogen) and stored at −80 °C. Once all samples were collected, samples were
thawed on ice and separated with chloroform using Phase Lock Gel tubes (5 
PRIME). RNA was isolated using the miRNeasy Plus Mini Kit (Qiagen) according 
to the manufacturer's protocol. One microgram of each sample was then poly-A 
selected, fragmented, and library prepared using the TruSeq RNA Sample Prep Kit 
(Illumina) according to the manufacturer’s protocol. Samples were indexed using 
TruSeq adapters. One hundred base-pair paired-end reads were generated for each 
sample on an Illumina HiSeq 2500 instrument at the Case Western Reserve 
University Genomics Core facility. Between 5 million and 13 million reads were 
generated per sample for drug time course experiments. EpiSC RNaseq data were 
previously published (GEO accession number GSE57403). EpiSCs, EpiSC OPCs,
and in vivo OPCs were sequenced to depths of 51,271,458 reads, 61,072,460 reads, 
and 62,530,709 reads, respectively. For in vivo isolated OPCs, CD140a+ cells were
immunopanned from the CNS of mouse pups at postnatal day 7 as described
previously. Cells were then cultured for 5 days in identical culture conditions to
EpiSC-derived OPCs before analysis. 

Reference genome files were retrieved from Illumina iGenomes
(http://cufflinks.cbcb.umd.edu/igenomes.html). Reads were aligned to the mm9 genome
using Tophat version 2.0.8 with default settings. Expression values of known
RefSeq genes were calculated in units of fragments per kilobase per million reads 
(FPKM) using Cufflinks version 2.0.2 (ref. 29). Expression values were tabled to 
eliminate background signal by converting all values below 0.25 to 0, and subse- 
quently adding 0.25 to all values. FPKMs were quantile normalized to correct for 
inter-sample variation. To identify genes whose expression was perturbed by drug 
treatments, duplicate samples of OPCs were treated with drug or vehicle for 2, 6, or 
12h. RNA sequencing data were tested for differential expression by comparing 
treatments to vehicle at each time point using Cuffdiff version 2.0.2 (ref. 30). The 
collective list of changed genes for each drug was evaluated with Ingenuity 
Pathway Analysis (application build 261899, content version 18030641). 
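
The background-suppression and quantile-normalization steps described above can be sketched in plain Python. The helper names and toy values are hypothetical, and the tie handling here (ties keep their sort order rather than sharing an averaged value) is a simplification; the text does not describe the exact implementation used.

```python
def floor_and_offset(fpkm, threshold=0.25):
    """Set FPKM values below `threshold` to 0, then add `threshold` to every value."""
    return [(0.0 if value < threshold else value) + threshold for value in fpkm]

def quantile_normalize(samples):
    """Quantile-normalize equal-length expression vectors (one list per sample).

    Each value is replaced by the mean, across samples, of the values that
    share its rank, so every sample ends up with the same distribution.
    """
    n_genes = len(samples[0])
    if any(len(sample) != n_genes for sample in samples):
        raise ValueError("all samples must cover the same genes")
    sorted_samples = [sorted(sample) for sample in samples]
    rank_means = [
        sum(sample[rank] for sample in sorted_samples) / len(samples)
        for rank in range(n_genes)
    ]
    normalized = []
    for sample in samples:
        order = sorted(range(n_genes), key=lambda gene: sample[gene])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = rank_means[rank]
        normalized.append(out)
    return normalized

# Toy example: three genes, two samples.
floored = floor_and_offset([0.1, 0.25, 2.0])
normalized = quantile_normalize([[1.0, 3.0, 2.0], [4.0, 6.0, 8.0]])
```

The offset keeps every value strictly positive, which avoids division by zero or taking the logarithm of zero when ratios between samples are computed later.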

Western blotting of mouse OPCs. EpiSC-derived OPCs were seeded into
poly-L-ornithine/laminin coated six-well plates and allowed to attach for 2 h in DMEM/F12
without additional factors. Cells were treated with indicated inhibitors or
DMSO for 1 h: SCH772984 (ChemieTek, 1 µM), SB590885 (Tocris, 10 nM),
LY294002 (Tocris, 10 µM), and PD0325901 (Stemgent, 1 µM). Cells were then
stimulated with drug or FGF2 (R&D Systems; 20 ng ml⁻¹) for 1 h and then lysed in
200 µl RIPA buffer (0.15 M NaCl, 0.05 M Tris, pH = 8.0, 1 mM EDTA, 1% Triton
X-100, 0.1% SDS, 10% glycerol, HALT protease and phosphatase inhibitor 
(Thermo Fisher) added just before use) and incubated on ice for 20 min. Lysates 
were centrifuged at 4°C and supernatant collected. Protein concentrations were 
determined by Pierce BCA protein assay kit (Thermo Fisher). Equal amounts of 
protein were resolved in a reduced manner on NuPAGE 4-12% Bis-Tris gels (Life 
Technologies) and transferred onto PVDF membranes (Life Technologies). Blots 
were blocked in either 5% BSA (phosphoprotein) or 5% milk (non-phosphopro- 
tein). Primary antibodies were all from the same vendor (Cell Signaling) and 
included phospho-Erk1/2 (4370S, clone D13.14.4E; 1:2,000), ERK1/2 (9107S, 
clone 3A7; 1:2,000), phospho-glucocorticoid receptor (4161S; 1:1,000), and glu- 
cocorticoid receptor (12041, clone D6H2L; 1:1,000) followed by incubation with 
HRP-conjugated secondary antibodies and chemiluminescent enhancement by 
West Pico substrate (Thermo Fisher). 

Generation and screening of human OPCs. Human OPCs were generated from 
skin fibroblast-derived human iPSC line (CWRU43, Tesar laboratory) and hESC 
lines H7 (NIH Human Embryonic Stem Cell Registry WA07; NIH approval num- 
ber NIHhESC-10-0061) and H9 (NIH Human Embryonic Stem Cell Registry 
WA09; NIH approval number NIHhESC-10-0062) as previously described.
iPSC- and hESC-derived OPCs were characterized by Sox10 (R&D Systems,
AF2864; 1:100) staining, and then seeded in 96-well plates at 40,000 cells per well 
for drug testing. Cells were cultured with 1 µM miconazole, 5 µM clobetasol, or
vehicle (DMSO) for 21 days, with fresh media changes with drug or vehicle every
2 days. Plates were fixed and stained with MBP (Abcam, ab7349; 1:100) then
imaged on the Operetta system. We analysed results with slight modification to 
HCA Acapella scripts used for mouse oligodendrocytes. 

Naive CD4+ T-cell assays. Naive CD4+ T cells (CD4+ L-selectin(hi) cells) were
purified using AutoMacs Magnetic Bead cell separation technology (Miltenyi 
Biotech) from total lymph node cells isolated from unprimed mice with purity 
ranging from 98 to 99.9%. For in vitro activation, 5 X 10° naive CD4+ T cells
were activated in the presence of plate-bound anti-CD3 (1 µg ml⁻¹) plus Th1-
(200 U ml⁻¹ interleukin-2 (IL-2), 40 U ml⁻¹ IL-12, 10 µg ml⁻¹ anti-IL-4) or Th17-
(10 ng ml⁻¹ TGF-β1, 50 ng ml⁻¹ IL-6, 1 µg ml⁻¹ anti-IFN-γ, 1 µg ml⁻¹ anti-IL-4,
1 µg ml⁻¹ anti-IL-2) promoting conditions. On day 4, the cultured T cells were
collected and the percentage of viable cytokine positive cells assessed by flow 
cytometry. The cells were stained with a LIVE/DEAD Fixable Violet Dead Cell 
Stain Kit, for 405 nm excitation (Life Technologies), anti-CD4-APC/Cy7 (clone 
RM4-5), anti-IFN-γ-PerCP/Cy5.5 (clone XMG1.2), and anti-IL-17-APC (clone
eBio17B7) (eBioscience). Viable cells (5 X 10°) were analysed per individual sample
using a BD Canto II cytometer (BD Biosciences), and the data were analysed
using FlowJo version 9.5.2 software (Tree Star).

Ex vivo lymphocyte recall assays. Female SJL/J (Harlan Laboratories) or C57BL/6 
mice were housed under SPF conditions. Six- to seven-week-old female mice were 
immunized subcutaneously with 100 µl of an emulsion containing 200 µg of
Mycobacterium tuberculosis H37Ra (BD Biosciences) and 50 µg of PLP139-151
(SJL/J) or MOG35-55 (C57BL/6) distributed over three sites on the flank. For ex
vivo culture, draining lymph nodes were collected on day 8 and cells were activated
in the presence of anti-CD3 (1 µg ml⁻¹) in the absence or presence of clobetasol,
miconazole, or benztropine (10° °-10-° M). To assess total cellular proliferation,
cultures were pulsed with tritiated thymidine (1 µCi) at 24 h and cultures were
harvested at 72 h. In replicate wells, culture supernatants were harvested at 72 h
after culture, and the levels of IFN-γ and IL-17 were assessed via Luminex assay
(Millipore). 

PLP139-151-induced relapsing-remitting EAE. Six- to seven-week-old female
SJL/J mice were induced with PLP139-151 as for ex vivo recall assays. Mice were
allowed to progress to disease onset at day 13 before being randomized into
treatment groups (n = 10 mice per group). Mice were then monitored for paralysis
and treated daily by intraperitoneal injection with vehicle (DMSO in sterile saline), 
benztropine (10 mg/kg), clobetasol (2 mg/kg), or miconazole (10 mg/kg) begin- 
ning on day 13 and ending on day 29. This period fully encompassed the acute 
phase of disease onset followed by remission and the primary disease relapse. 
Treatments were blinded to the experimenters performing the assays. Mice were 
followed for disease severity in a blinded fashion with disease scoring as follows: 0, 
no abnormality; 1, limp tail; 2, limp tail and hind limb weakness; 3, hind limb 
paralysis; 4, hind limb paralysis and forelimb weakness; and 5, moribund. 
MOG35–55 chronic progressive EAE model. EAE was induced by immunizing 
10-week-old C57BL/6 female mice with a 100 µl injection of MOG35–55/complete 
Freund’s adjuvant emulsion (Hooke Laboratories). One hour after immunization, 
mice were given 100 ng of pertussis toxin by 100 µl intraperitoneal injection. A 
second dose of pertussis toxin was administered the next day. EAE onset was 
monitored daily and scored using a clinical scale where 0 represented no 
abnormality; 1, limp tail; 2, limp tail and hind limb weakness; 3, hind limb paralysis; 4, 
hind limb paralysis and forelimb weakness; and 5, moribund. Mice that appeared 
moribund or exhibited forelimb paralysis were immediately euthanized and not 
used for the study. Once mice reached peak of disease (~day 15; clinical score = 3) 
they were randomized into treatment groups, and drug or vehicle (DMSO in 
saline) was administered intraperitoneally daily for 10 days (n = 12-16 mice per 
group). Doses for each drug were miconazole (10 mg/kg), clobetasol (2 mg/kg), 
and benztropine (10 mg/kg). At these doses, no drugs showed any overt evidence 
of toxicity to any of the animals. Experimenters were blinded to the identity of the 
treatments and animals were scored daily. Cumulative disease scores for each 
animal were calculated during the treatment period, and a two-tailed t-test 
compared drug- with vehicle-treated groups. The extent of recovery for each animal 
was calculated as the difference between the peak disease score and the score at the 
end of each experiment, and a two-tailed t-test was used to compare each 
treatment with vehicle. External validation of MOG35–55 EAE experiments (n = 12 
mice per group) was performed at Hooke Laboratories with experimenters blinded 
to the identity of the substances. FTY720 (fingolimod, 1 mg/kg), a known 
immunomodulatory drug, was used as a positive control during external validation of 
miconazole (10 mg/kg). 
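
The two per-animal summary measures described for the EAE models (cumulative disease score and extent of recovery) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and example scores are hypothetical, and the pooled-variance (Student) form of the t statistic is an assumption, since the text states only that a two-tailed t-test was used.

```python
from math import sqrt
from statistics import mean, variance


def cumulative_score(daily_scores):
    """Sum of the daily clinical scores (0-5 scale) over the treatment period."""
    return sum(daily_scores)


def extent_of_recovery(daily_scores):
    """Peak disease score minus the score on the final day of the experiment."""
    return max(daily_scores) - daily_scores[-1]


def student_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom (assumed test variant).

    The two-tailed P value would be read from a t distribution with the
    returned degrees of freedom; that step is omitted here.
    """
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical daily scores for two vehicle-treated and two drug-treated mice.
vehicle = [[3, 3, 3, 3], [3, 4, 3, 3]]
drug = [[3, 2, 2, 1], [3, 3, 2, 1]]
t_stat, dof = student_t([cumulative_score(m) for m in vehicle],
                        [cumulative_score(m) for m in drug])
```

Summarizing each animal by a single number before testing keeps the unit of analysis at the animal rather than at the individual daily observation.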

Animal welfare. All animal experiments were performed in accordance with 
protocols approved by the Case Western Reserve University and Northwestern 
University Institutional Animal Care and Use Committees. 

Extended Data Figure 1 | Performance of the primary screen. 

a, Representative flow cytometry plots showing co-expression of NG2 and 
CD140a in both batches of EpiSC-derived OPCs used for this study. The 
batches of EpiSC-derived OPCs were sorted to purity (circled areas of plots) 
before use in this study. b, RNaseq expression heat map showing 
downregulation of pluripotent stem cell transcripts and upregulation of OPC 
transcripts when EpiSCs were differentiated into OPCs. Fragments per kilobase 
exon per million reads (FPKM) for each transcript are shown compared with in 
vivo isolated mouse OPCs. c, Quantification of DMSO (v/v) tolerance of 
EpiSC-derived OPCs in 96-well plates shown as mean ± s.e.m. For reference, 0.05% 
(v/v) DMSO was used as vehicle for all in vitro experiments in this study; n = 16 
wells per group with >690 cells scored per well. d, Quantification of cell 
viability of thyroid hormone (positive control) and DMSO vehicle treatments 
per well across all ten assay plates shown as mean ± s.e.m.; n = 80 wells per 
group with >6,800 cells scored per well. e, Signal to background (S/B) mean 
values with standard deviation (s.d.) of controls from the entire screen; n = 80 
wells per group. f, Raw data of MBP process length from the primary screen for 
thyroid hormone treatment and DMSO vehicle across all plates shown as 
mean ± s.d.; n = 8 wells per group with >6,800 cells scored per well. g, Raw 
data of MBP process intensity from the primary screen for thyroid hormone 
treatment and DMSO vehicle shown as mean ± s.d.; n = 8 wells per group with 
>6,800 cells scored per well. 
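
The FPKM unit used in panel b (fragments per kilobase of exon per million reads) can be written out as a short calculation. The sketch below uses the standard definition; the function name and the example numbers are hypothetical and are not taken from the study.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments.

    fragments              -- fragments assigned to the transcript
    exon_length_bp         -- summed exon length of the transcript, in base pairs
    total_mapped_fragments -- all mapped fragments in the library
    """
    kilobases = exon_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


# Hypothetical example: 500 fragments on a 2-kb transcript in a library of 25 million.
example = fpkm(500, 2_000, 25_000_000)
```

Normalizing by both transcript length and library size is what allows values from in vitro derived and in vivo isolated OPC libraries to be placed on the same heat map.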

Extended Data Figure 2 | Drug hit ranking and validation. a, Chart ranking 
the 22 primary drug hits (single dose rank) into four tiers on the basis of 
calculation of EC50 to induce PLP1+ oligodendrocytes from OPCs and the 
concentration at which 50% of the cells were lost (50% Tox) calculated from a 
seven-point dose treatment; n = 4 wells per dose per drug using independently 
sourced drug and separate OPC batch from the primary screen. Tiers ranged 
from the most potent and least toxic effectors to the least potent and most toxic: 
tier a (green), tier b (grey), tier c (orange), and tier d (red). The 1,536-well 
format external validation of 14 out of 16 tested hits is also shown. Drugs were 
further ranked into groups of high (green), medium (grey), and low (orange) on 
the basis of their ability to increase MBP+ axonal ensheathment in mouse 
cerebellar brain slices relative to vehicle (DMSO)-treated controls as measured 
by HCA. NT, not tested. b, External validation whole 1,536-well images of 
MBP+ (green) oligodendrocytes generated from OPCs after 72 h of treatment. 
GE InCell HCA is shown with processes traced in yellow and nuclei in blue. 
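
To make the EC50 ranking in panel a concrete, the sketch below estimates a half-maximal concentration from a seven-point dose series. The caption does not say how EC50 was calculated, so the method here (linear interpolation on a log-dose axis between the two doses that bracket the half-maximal response) is a simplifying assumption, and the doses and responses are hypothetical. A four-parameter logistic fit would be the more usual choice for real data.

```python
from math import log10


def ec50_by_interpolation(doses, responses):
    """Estimate EC50 from ascending doses and monotonically rising responses.

    Returns the dose at which the response crosses the midpoint between the
    lowest and highest observed response, or None if no crossing is found.
    """
    low, high = min(responses), max(responses)
    half = low + (high - low) / 2
    for i in range(len(doses) - 1):
        r0, r1 = responses[i], responses[i + 1]
        if r0 <= half <= r1 and r1 != r0:
            fraction = (half - r0) / (r1 - r0)
            log_d0, log_d1 = log10(doses[i]), log10(doses[i + 1])
            return 10 ** (log_d0 + fraction * (log_d1 - log_d0))
    return None


# Hypothetical seven-point series (dose in µM, response in % PLP1+ cells).
doses = [0.001, 0.01, 0.1, 1, 10, 100, 1000]
responses = [0, 0, 10, 30, 40, 40, 40]
estimate = ec50_by_interpolation(doses, responses)
```

The same function applied to a falling viability curve (after negating the responses) would give a rough analogue of the 50% Tox value.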

Extended Data Figure 3 | Primary screen structure-activity analysis. 
Chemoinformatic identification of two substructures consistently enriched in 
high-performing drugs in the OPC assay. Numerical activity rank in the 
primary screen is indicated with the top 22 shown in green, 23-50 shown in 
grey, and >51 in red. a, 1,3-Diazoles, mono-substituted at the 1-position 
showed consistent activity on OPCs. b, c, 1,3-Diazoles, poly-substituted at two 
or more of the R groups (b), or 1,2,4-triazoles, mono-substituted at the 
1-position (c) showed no activity on OPCs. d, The sterane base structure 
showed enrichment in the top performing hits. 

Extended Data Figure 4 | Histological assessment of remyelination in the 
LPC-induced model of demyelination. a, Representative electron 
micrographs showing remyelinated axons within lesions of miconazole treated 
mice at 8 d.p.l. Scale bar, 2 µm. b, Histological sections stained with toluidine 
blue showing the extent of remyelination in the lesions of treated animals at 
12 d.p.l. Normal uninjured myelin appears to the left of the black dashed line 
demarcating the definitive lesion edge. Scale bar, 20 µm. 

Extended Data Figure 5 | Miconazole and clobetasol enhance myelination 
in vivo. a, b, Representative immunohistochemical images of the lateral corpus 
callosum (CC) of postnatal day 6 mouse pups that had been injected 
intraperitoneally daily for 4 days previously starting on postnatal day 2 with 
vehicle, clobetasol (2 mg/kg), or miconazole (10 mg/kg). CC1 (red) marks 
newly generated oligodendrocytes (a) and MBP (green) shows the extent of 
developmental myelination (b). Clobetasol and miconazole treatment each 
induce a marked increase in the number of CC1-positive oligodendrocytes in 
the lateral corpus callosum (a) and a significant increase in the length of the 
corpus callosum covered with aligned MBP+ fibres (b). Scale bar, 200 µm. 
Two-tailed t-test, *P = 0.05 and **P = 0.01. Str, striatum. All graphs are presented as 
mean ± s.e.m. 

Extended Data Figure 6 | RNaseq time course of drug-treated OPCs. 
a, Volcano plots of all genes from OPCs treated with clobetasol or miconazole 
relative to vehicle control, with differentially expressed genes highlighted (red). 
Significance (measured as −log10[q value]) is plotted in relation to expression 
change (log2[treatment/vehicle]). Time course was after 2, 6, and 12 h of drug 
[…] expressed at any time point and increased in treatments versus vehicle (left), as 
well as those decreased in treatments versus vehicle (right). c, Significant 
canonical pathways perturbed by each drug treatment according to Ingenuity 
Pathway Analysis. 
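
The two axes of the volcano plots in panel a can be computed per gene as sketched below. This is an illustration only: the logarithm base for the expression change is assumed to be 2, the q-value cutoff of 0.05 used for highlighting is an assumption (the caption gives no threshold), and the function names and input values are hypothetical.

```python
from math import log2, log10


def volcano_point(treatment_expr, vehicle_expr, q_value):
    """Return (x, y) = (log2 fold change, -log10 q value) for one gene."""
    return log2(treatment_expr / vehicle_expr), -log10(q_value)


def is_highlighted(q_value, q_cutoff=0.05):
    """Flag a gene as differentially expressed under an assumed q cutoff."""
    return q_value < q_cutoff


# Hypothetical gene: fourfold higher in drug-treated OPCs, q = 0.001.
x, y = volcano_point(8.0, 2.0, 0.001)
flag = is_highlighted(0.001)
```

Genes with large absolute x and large y sit in the upper corners of the plot, which is where the highlighted genes concentrate.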

Extended Data Figure 7 | Global phosphoproteomic analysis of 
miconazole-treated OPCs. a, b, OPCs treated with miconazole for (a) 1 h or 
(b) 5 h followed by global phosphoproteomic analysis. Proteins highlighted in 
green were observed to have a twofold or greater increase in phosphorylation 
whereas those highlighted in red were observed to have a twofold or greater 
decrease in phosphorylation compared with time-point-matched 
vehicle-treated controls. Proteins highlighted in grey were detected in the analysis but 
were not changed compared with vehicle control. See Supplementary Table 3 
for the full phosphoproteomic data set. c, Quantification of the percentage of 
MBP+ oligodendrocytes differentiated from mouse OPCs after 72 h of 
treatment with DMSO, miconazole (1 µM), or voriconazole (seven doses, 
0.01-6.7 µM); n = 6 wells per condition with >6,000 cells scored per well. 
Graph presented as mean ± s.e.m. The chemical structure of voriconazole 
is shown. 
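
The colour coding described for panels a and b amounts to a threshold rule on the fold change in phosphorylation relative to the time-matched vehicle control. A minimal sketch follows; the function name, the ratio-based input, and the example values are hypothetical, and how fold change was derived from the raw phosphoproteomic signal is not specified in the caption.

```python
def phospho_colour(treated_signal, vehicle_signal):
    """Classify a protein by fold change in phosphorylation versus vehicle.

    green -- twofold or greater increase
    red   -- twofold or greater decrease
    grey  -- detected but not changed by these criteria
    """
    fold_change = treated_signal / vehicle_signal
    if fold_change >= 2.0:
        return "green"
    if fold_change <= 0.5:
        return "red"
    return "grey"


# Hypothetical signals for three proteins at one time point.
colours = [phospho_colour(t, v) for t, v in [(4.0, 2.0), (1.0, 2.0), (2.6, 2.0)]]
```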

Extended Data Figure 8 | Miconazole and clobetasol enhance human OPC 
differentiation. a, Representative phase contrast image of a hESC colony 
cultured on matrigel. b, Representative phase contrast image of hESC-derived 
OPCs. c, hESC-derived OPCs stain positive for Sox10. d, e, Representative 
images of hESC-derived OPCs (d) and hiPSC-derived OPCs (e) treated with 
vehicle (DMSO), miconazole (1 µM), or clobetasol (5 µM) for 21 days stained 
for MBP (red). f, g, HCA of hESC-derived (f) and hiPSC-derived (g) OPCs 
differentiated in the presence of drugs or vehicle over 21 days; n = 3-5 wells per 
condition with >120 cells scored per well. Graphs presented as mean ± s.e.m. 
Scale bars, 100 µm. 

Extended Data Figure 9 | Effects of miconazole and clobetasol on immune 
cell survival and function. a-d, Quantification of cell proliferation (a, c) and 
differentiation (b, d) of naive CD4+ T cells from unprimed SJL/J mice after 
activation with plate-bound anti-CD3 under Th1 (a, b) or Th17 (c, d) cell 
driving conditions. e-j, Ex vivo recall assays quantifying cell proliferation 
(ΔCPM) (e, h), with IFN-γ (f, i) and IL-17 (g, j) cytokine production from 
lymphocytes of mice primed with PLP139–151 (e-g) or MOG35–55 (h-j). Cultures 
were treated with vehicle (DMSO), benztropine, clobetasol, or miconazole 
(10-°-10-° M) and analysed after 4 days. Four independent replicates are 
shown for each assay. 

Extended Data Figure 10 | Histological improvements in MOG35–55 EAE 
spinal cords after treatment with miconazole or clobetasol. 
a, b, Representative images of luxol fast blue (LFB) staining (a) demonstrated a 
clear decrease in areas of white matter disruption in the spinal cords of 
drug-treated animals which coincides with increased MBP staining (b). c, IBA1 
staining showed a small reduction of immune cell infiltration into the lesion 
areas, especially in clobetasol-treated animals, but not an abrogation. 
d, e, Representative images stained with toluidine blue (d) and electron 
micrographs (e) revealed a reduction in the areas of demyelination in 
drug-treated animals. Lesioned areas are outlined with black dotted lines. Insets in 
toluidine blue staining show higher magnification of myelination in the 
corresponding spinal cords. Scale bars, 100 µm (a-c, d) and 2 µm (e). 

Intrinsic retroviral reactivation in human 
preimplantation embryos and pluripotent cells 


Edward J. Grow, Ryan A. Flynn, Shawn L. Chavez, Nicholas L. Bayless, Mark Wossidlo, Daniel J. Wesche, Lance Martin, 
Carol B. Ware, Catherine A. Blish, Howard Y. Chang, Renee A. Reijo Pera & Joanna Wysocka 


Endogenous retroviruses (ERVs) are remnants of ancient retro- 
viral infections, and comprise nearly 8% of the human genome’. 
The most recently acquired human ERV is HERVK(HML-2), 
which repeatedly infected the primate lineage both before and after 
the divergence of the human and chimpanzee common ancestor””’. 
Unlike most other human ERVs, HERVK retained multiple copies 
of intact open reading frames encoding retroviral proteins’. 
However, HERVK is transcriptionally silenced by the host, with 
the exception of in certain pathological contexts such as germ-cell 
tumours, melanoma or human immunodeficiency virus (HIV) 
infection®’. Here we demonstrate that DNA hypomethylation at 
long terminal repeat elements representing the most recent genomic 
integrations, together with transactivation by OCT4 (also 
known as POU5F1), synergistically facilitate HERVK expression. 
Consequently, HERVK is transcribed during normal human 
embryogenesis, beginning with embryonic genome activation at 
the eight-cell stage, continuing through the emergence of epiblast 
cells in preimplantation blastocysts, and ceasing during human 
embryonic stem cell derivation from blastocyst outgrowths. 
Remarkably, we detected HERVK viral-like particles and Gag pro- 
teins in human blastocysts, indicating that early human develop- 
ment proceeds in the presence of retroviral products. We further 
show that overexpression of one such product, the HERVK access- 
ory protein Rec, in a pluripotent cell line is sufficient to increase 
IFITM1 levels on the cell surface and inhibit viral infection, sug- 
gesting at least one mechanism through which HERVK can induce 
viral restriction pathways in early embryonic cells. Moreover, Rec 
directly binds a subset of cellular RNAs and modulates their ribo- 
some occupancy, indicating that complex interactions between 
retroviral proteins and host factors can fine-tune pathways of early 
human development. 

Given the substantial contribution of transposons to the human 
genome and their emerging roles in shaping host regulatory net- 
works*”, understanding the dynamic expression and function of these 
genomic elements is important for dissecting both human- and prim- 
ate-specific aspects of gene regulation and development. We used 
published single-cell RNA-sequencing (RNA-seq) data sets to analyse 
the expression of major transposon classes at various stages of human 
preimplantation embryogenesis’’, a developmental period associated 
with dynamic changes in DNA methylation and transposon express- 
ion''. This analysis revealed two major clusters, one primarily consist- 
ing of repeats that begin to be transcribed at the onset of embryonic 
genome activation (EGA), which in humans occurs around the eight- 
cell stage, and a second cluster of repeats, whose transcripts can be 
detected in the embryo before EGA, indicating maternal deposition 
(Extended Data Fig. 1a). Within each cluster, more discrete stage-specific 
changes in repeat transcription could be observed, such that 
analysis of the repetitive transcriptome alone was able to distinguish 
pre- and post-EGA cells, as well as eight-cell/morula cells from 
blastocyst cells (Extended Data Fig. 1a). For example, HERVK and its 
regulatory element, long terminal repeat (LTR)5HS, were both 
induced in eight-cell stage embryos and morulae, and continued to be 
expressed in epiblast cells of blastocysts (Fig. 1a-c and Extended 
Data Fig. 1a). We further observed that although HERVK was 
expressed in blastocyst outgrowths (passage 0 human embryonic stem 
(ES) cells), it was downregulated by passage 10 (Fig. 1d). In contrast, 
transcripts of another HERV, HERVH, and of its regulatory element 
LTR7, were detected before EGA and throughout preimplantation 
development, including in all blastocyst lineages and human ES cells 
(Extended Data Fig. 1a-c). 

Recent studies have reported conditions for capturing a human 
naive pluripotent state in vitro’?"'°, and we used RNA-seq to analyse 
the repetitive transcriptome of ELF1, a cell line derived from an 
eight-cell-stage human embryo under naive culture conditions, and 
compared it to the repeat expression in ELF1 cells matured in vitro 
into a primed state’*. Surprisingly, although many transposon classes 
(for example, HERVH and LINE1-HS) were highly expressed in both 
cell states, only a few showed differential levels between the two 
(Fig. 1e). In particular, transcripts corresponding to HERVK 
proviruses and their regulatory elements, LTR5HS (but not the older 
LTR5a or LTR5b; see later), were among the most strongly induced 
in naive versus primed ELF1 cells (Fig. 1e and Extended Data Fig. 1d). 
Similar results were obtained by analysing available transcriptomes of 
primed H1 human ES cells and naive 3iL cells derived from them, as 
well as of primed H9 human ES cells and those ‘reset’ to the naive state 
by NANOG and KLF2 transgene expression (refs 12, 15; Fig. 1e). Therefore, 
naive-state-specific upregulation of HERVK is consistent across mul- 
tiple genetic backgrounds, derivation methods or culture conditions. 

From an evolutionary perspective, HERVK is especially interesting, 
as it is the most recently acquired HERV from which multiple inser- 
tions have retained protein-coding potential’” (Extended Data Fig. 2a). 
While HERVK is present in all Old World primates, nearly a third of 
its proviruses in the human genome represent human-specific inser- 
tions, and 48% of those show polymorphisms in the human popu- 
lation, suggesting that HERVK was active within the last 200,000 
years’* (Extended Data Fig. 2a). All human-specific and human- 
polymorphic HERVK elements are regulated by a specific LTR sub- 
group, LTR5HS, whereas insertions representing older integrations 
typically have regulatory elements of the LTR5a or LTR5b subtype* 
(Extended Data Fig. 2a). Interestingly, during human preimplantation 
development and in the naive state, transcripts originating from 
LTR5HS, but not LTR5a or LTR5b, are preferentially expressed 
(Fig. 1e), and we observed an upregulation of human-specific 
proviruses compared to evolutionarily older elements (Fig. 2a). We 
hypothesized that this differential regulation can be explained by 
cis-regulatory change in LTR5HS. Indeed, sequence analysis uncovered an 
OCT4 motif at position 693-699 base pairs (bp) of LTR5HS, which 
was conserved across diverse LTR5HS sequences, but not present in 
LTR5a/LTR5b, despite their overall high (~88%) sequence homology 
with LTR5HS (Fig. 2b and Extended Data Fig. 2a). To test whether 
OCT4 binding contributes to the transcriptional activation of 
LTR5HS, we used pluripotent NCCIT human embryonic carcinoma 
(EC) cells, which express OCT4, but, in contrast to human ES cells, are 
permissive for HERVK expression®”” (Extended Data Fig. 2b-d). 
Chromatin immunoprecipitation with quantitative polymerase chain 
reaction (ChIP-qPCR) analysis of human EC cells showed preferential 
occupancy of OCT4, p300 and histone marks of active chromatin at 
LTR5HS elements, as compared to LTR5a/LTR5b (Fig. 2c). In 
contrast, we did not detect OCT4 or p300 binding at LTR5HS in primed 
human ES cells (Extended Data Fig. 2f). Consistent with a functional 
role in HERVK activation, knockdown of OCT4 or SOX2, but not of 
NANOG, led to a significant decrease in viral transcripts in human EC 
cells (Extended Data Fig. 2e and Extended Data Fig. 3a). Furthermore, 
the activity of transcriptional reporters driven by LTR5HS was 
impaired by mutations in the OCT4 motif (Fig. 2d and Extended 
Data Fig. 3b). 

Figure 1 | Transcriptional reactivation of 
HERVK in human preimplantation embryos and 
naive human ES cells. a, Schematic of human 
preimplantation development. b, HERVK 
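
The kind of sequence analysis that locates a transcription factor motif within an LTR can be illustrated with a position weight matrix (PWM) scan. The sketch below is not the authors' pipeline: the aligned example sites, the test sequence, the pseudocount, and the uniform background are all hypothetical choices made for illustration, and only the forward strand of an upper-case A/C/G/T sequence is scanned.

```python
from math import log2

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned binding sites."""
    width = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(width):
        column = [site[i] for site in sites]
        pwm.append({
            base: log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def best_match(sequence, pwm):
    """Return (1-based start position, score) of the best-scoring window."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if best is None or score > best[1]:
            best = (start + 1, score)
    return best


# Hypothetical aligned sites resembling an octamer-type motif, and a toy sequence.
example_sites = ["ATGCAAAT", "ATGCAAAT", "ATGCATAT", "ATGCAAAG"]
pwm = build_pwm(example_sites)
position, score = best_match("GGCCATGCAAATCCGG", pwm)
```

Running the same scan over LTR5HS, LTR5a and LTR5b consensus sequences and comparing best scores at the aligned position would be one way to express presence or absence of a motif across subfamilies.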

The aforementioned observations are consistent with transactiva- 
tion by OCT4 being a driver of LTR5HS regulatory activity, but do not 
explain the differential transcriptional status of HERVK in primed 
versus naive human ES cells and in human EC cells, as all three express 
OCT4. We hypothesized that DNA methylation may contribute an 
additional layer of regulation, and indeed we observed HERVK hypo- 
methylation of solo and proviral LTR5HS (but not the Gag open 
reading frame (ORF)) in human EC cells and naive ES cells, as com- 
pared to primed human ES cells and human induced pluripotent stem 
cells (iPSCs) (Fig. 2e and Extended Data Fig. 3c, d). Strong and pref- 
erential demethylation of LTR5HS was also observed in recently pub- 
lished DNA methylation maps from human preimplantation embryos, 
whereas HERVK coding sequences remained more highly 
methylated". Importantly, treatment of primed human ES cells with a 
DNA methylation inhibitor, 5-aza-2′-deoxycytidine, for 24 h induced 
HERVK transcription, with 8-12-fold upregulation of an early 
transcript encoding an accessory protein, Rec (Fig. 2f). In addition, 
inhibition of DNA methylation, together with overexpression of OCT4 and 
SOX2, jointly facilitated HERVK transcription in HEK293 cells 
(Fig. 2g and Extended Data Fig. 3e), indicating that DNA 
hypomethylation and transactivation by OCT4 synergistically promote HERVK 
expression. 

biological replicates for both conditions). Middle, 
3iL/primed H1 human ES cells (data are taken 
from ref. 12) (n = 3 biological replicates for both 
conditions). Right, naive/primed H9 human ES 
cells (data are taken from ref. 15) (n = 3 biological 
replicates for both conditions). Significant repeats 
indicated in red at false discovery rate 
(FDR) < 0.05, DESeq. hESC, human ES cells. 
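
The false discovery rate threshold (FDR < 0.05) applied to repeat classes in Fig. 1e can be illustrated with the Benjamini-Hochberg adjustment, which is the multiple-testing correction commonly paired with DESeq output. Treating it as the procedure behind the figure is an assumption, and the P values below are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P values for four repeat classes.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [value < 0.05 for value in adj]
```

In this toy example only the first repeat class passes the 0.05 threshold after adjustment, although three of the four raw P values are below 0.05.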

A defining characteristic of HERVK is that multiple proviruses have 
retained ORFs encoding full-length retroviral proteins*. Consequently, 
HERVK reactivation in pathological conditions has been associated 
with the presence of HERVK proteins*’, prompting us to examine 
whether retroviral proteins are also present in human embryos. We 
used a well-characterized monoclonal antibody recognizing the 
HERVK Gag precursor and its proteolytically processed form 
Capsid, which detects cytoplasmic signal with a characteristic punctate 
pattern in human EC cells and in a subset of naive ELF1 cells, but 
shows no staining in primed human ES cells and loss of signal in 
human EC cells after gag short interfering RNA (siRNA) knockdown 
(Extended Data Fig. 4a—d). In human blastocysts, Gag/Capsid staining 
was also detected in dense cytoplasmic puncta resembling those seen 
in human EC cells and naive ELF1 cells (Fig. 3a and Extended Data Fig. 
4a, d, e), with all analysed blastocysts (n = 19/19) showing a robust 
signal. Several HERVK-positive human EC cell lines have been shown 
to produce viral-like particles (VLPs)’*’. Remarkably, heavy metal 
staining transmission electron microscopy (TEM) of blastocysts 
revealed the presence of cytoplasmic, electron-dense particles of 
approximately 100 nm in diameter—the reported size of reconstructed 
HERVK VLPs—with electron-lucent cores*’”” (Fig. 3b). Additionally, 
human blastocyst cells also contained cytosolic vesicles enclosing 50 or 
more smaller, highly electron-dense particles of approximately 75 nm 
in size, which resembled the immature VLPs also seen in human EC 
cells (Fig. 3c and Extended Data Fig. 5a). The presence of HERVK- 
derived particles in human blastocysts was further supported by 
immuno-gold TEM staining, which detected VLPs (or vesicles with 
multiple VLPs) labelled by Gag/Capsid antibodies either within 
embryonic cells or on the cell surface, similar to those seen in 
immuno-gold TEM staining of human EC cells (Fig. 3d, e and 
Extended Data Fig. 5b); control blastocyst staining showed no signal 
from secondary antibody (Extended Data Fig. 5c). Altogether, these 
data demonstrate that human preimplantation development proceeds 
in the presence of retroviral proteins and VLPs (summarized in 
Extended Data Fig. 5d). 

Recent studies highlight the ability of TEs to contribute 
regulatory sequences to mammalian genomes’”*™. For example, MERV-L 
elements in mouse have been reported to function as alternative 
promoters, driving expression of many two-cell stage-specific chimaeric 
transcripts’’. However, we did not detect robust evidence for 
HERVK-associated chimaeric transcription (Extended Data Fig. 6a, b and 
Supplementary Table 1), suggesting that LTR5HS is unlikely to 
contribute promoter activity to nearby host genes. Alternatively, LTR 
sequences derived from ERVs could be co-opted to act as 
long-distance enhancers for the host**. In agreement with such a possibility, 
LTR5HS elements were marked by p300 and H3K27ac (Fig. 2c), while 
genes located in their vicinity showed a strong bias for 
naive-state-enriched expression, regardless of their upstream or downstream 
position in relation to the LTR5HS (Extended Data Fig. 6c-e). 
However, we cannot rule out that this result could be a consequence 
of preferential HERVK integration near genes active in the naive state. 

HERVK encodes a small accessory protein, Rec, homologous to 
HIV Rev, which binds to and promotes nuclear export and translation 
of viral RNAs”. rec, an early viral transcript derived through 
alternative splicing of the env gene (Extended Data Fig. 2a), is expressed in 
naive ES cells and human blastocysts, and is rapidly induced in primed 
human ES cells exposed to 5-aza-2′-deoxycytidine (Extended Data Fig. 
7a and Fig. 2f). We hypothesized that Rec-mediated nuclear export of 
viral RNAs into the cytoplasm might ultimately lead to the induction 
of innate antiviral responses, which typically rely on cytosolic 
detection of viral RNA/DNA and protein. We noted a striking induction of 
messenger RNA encoding an interferon-induced viral restriction 
factor IFITM1 (ref. 26; also known as FRAGILIS2) in human epiblast 
cells'®, as well as upregulation of IFITM1 transcripts and surface 
protein levels in human naive versus primed human ES cells (Extended 
Data Fig. 7b, c, f and Supplementary Table 6). Furthermore, expression 
of a rec transgene in human EC cells was sufficient to elevate 
surface-localized IFITM1 protein levels (Fig. 4a). This was at least in part 
mediated through an effect on IFITM1 mRNA transcription or 
stability, as Rec overexpression or knockdown, respectively, 
increased or decreased IFITM1 mRNA levels (Extended Data Fig. 7d). 
Of note, although the minimal components of the JAK/STAT interferon 
pathway are present in human EC cells, many other interferon-induced 
genes are not upregulated or expressed, indicating that HERVK triggers 
a precise antiviral response in host cells (Supplementary Table 2). To test 
whether HERVK expression provides viral resistance, we infected 
control wild-type human EC cells, control human EC cells expressing a 
green fluorescent protein (GFP) transgene, or two independent clonal 
Rec human EC cell lines (Rec-hECCs) with influenza H1N1(PR8) virus. 
Interestingly, the Rec-hECCs exhibited substantially attenuated 
infection levels as compared to the control GFP-hECCs (Fig. 4b) or wild-type 
human EC cells (Extended Data Fig. 7e). 

Figure 2 | Transactivation by OCT4 and DNA hypomethylation of LTR5HS 
synergistically regulate HERVK transcription. a, Expression of different 
HERVK proviral sequences, grouped according to the oldest common ancestor, 
as defined previously*. *P value < 0.05, non-paired Wilcoxon test. Solid 
line indicates mean. RNA-seq data set used for the analysis was from 3iL naive 
H1 cells (ref. 12); n = 3 biological replicates. b, Conserved OCT4 site in LTR5HS with 
position weight matrix of the corresponding motif shown for comparison (top). 
Presence/absence of OCT4 motif in distinct LTR5 sequences is indicated 
(bottom); more detailed sequence information is in Extended Data Fig. 2a. 
c, ChIP-qPCR analyses from human EC cells (NCCIT) using antibodies 
indicated on top of each graph. Signals were quantified using primer sets 
specific to LTR5HS (5HS), LTR5a (5a) and LTR5b (5b) consensus sequences or 
two ‘negative’ intergenic, non-repetitive regions (neg1, neg2). *P value < 0.05 
compared to negative control, one-sided t-test. n = 4 biological replicates, error 
bars are ±1 standard deviation (s.d.). d, Flow cytometry analysis of human EC 
cells with integrated LTR5HS fluorescent reporters, either wild type (middle) or 
with OCT4 motif mutation (bottom). Red fluorescent protein (RFP)-positive 
population was gated using side-scatter area (SSC-A) and cells with integrated 
negative control reporter (top) containing minimal thymidine kinase (miniTK) 
promoter. Shown is a representative result of two independent experiments. 
e, Bisulfite conversion quantification of LTR5HS 5-methyl-cytosine levels 
measured using LTR5HS-specific primer pairs anchored in the LTR5HS 
consensus sequence (left) or provirus-specific 5′ LTR5HS (right) for human EC 
cells (hECC; NCCIT), human ES cells (hESC; H9) or naive human ES cells 
(ELF1). Filled circles depict modified cytosines, open circles depict unmodified 
cytosines. Human EC cells (NCCIT) and naive human ES cells (ELF1) are less 
methylated than primed human ES cells (H9). P < 0.05, non-paired Wilcoxon 
test. f, qPCR with reverse transcription (RT-qPCR) analysis of human ES cells 
(H9) treated with indicated concentrations of 5-aza-2′-deoxycytidine for 
24 hours. *P value < 0.05, one-sided t-test. n = 3 biological replicates, error 
bars ±1 s.d. g, RT-qPCR analysis of HERVK rec RNA levels in HEK293 cells 
treated with indicated concentrations of 5-aza-2′-deoxycytidine, followed by 
transfection with OCT4/SOX2 expression constructs. *P value < 0.05, 
one-sided t-test; NS, not significant. n = 4 biological replicates, error bars ±1 s.d. 

Figure 3 | Human blastocysts contain HERVK proteins and viral-like 
particles. a, Immunofluorescence of human blastocysts (days post-fertilization 
(DPF) 5-6) stained with 4′,6-diamidino-2-phenylindole (DAPI; blue), OCT4 
antibody (green), and HERVK Gag/Capsid antibody (red). Images show a 
representative example (n = 19 embryos). Scale bar = 50 µm. White arrow 
indicates an OCT4+ nucleus, surrounded by cytoplasmic Gag/Capsid (Cap), 
which is shown with higher magnification in an inset. b, Heavy metal staining 
TEM of a human blastocyst. Arrow indicates putative VLP (found in n = 2/3 
blastocysts, DPF 5-6). Higher magnification of indicated region is shown in 
inset. Scale bar = 200 nm. c, Heavy metal staining TEM of human blastocyst. 
Arrow indicates putative immature VLP, bracket indicates vesicle filled with 
putative VLP (found in n = 2/3 blastocysts, DPF 5-6). Scale bar = 100 nm. 
d, e, Immuno-TEM of human blastocysts with Gag/Capsid staining; region of 
higher magnification is boxed. Representative examples of budding (d) and 
cell-internal (e) particles are shown; n = 3 blastocysts (DPF 5-6), n = 3 labelled 
particles in two embryos. 

Retroviral accessory proteins often masterfully manipulate host cell 
factors to achieve optimal replicative efficiency. To examine whether, 
beyond reported binding to HERVK 3’ LTRs”, Rec can also 
associate with cellular RNAs, we performed tandem affinity purification 
iCLIP-seq in human EC cells expressing Flag-GFP (GFP) or Flag- 
GFP-tagged rec transgene (Extended Data Fig. 8a, b). We did not 
detect associated RNA in the control Flag-eGFP purifications, indi- 
cating low nonspecific RNA recovery of our assay (Extended Data Fig. 
8b). In contrast, parallel Rec purifications from two Flag-GFP-Rec-expressing 
clones yielded ultraviolet-crosslinked RNAs, sequencing 
of which demonstrated that in vivo, Rec robustly binds LTR5HS, but 
only in the region previously defined as containing the highly struc- 
tured Rec-responsive element**’* (Fig. 4c and Extended Data 
Fig. 8b, c). In addition, Rec directly interacts with ~1,600 host 
mRNAs, preferentially in their 3’ untranslated regions (UTRs), a posi- 
tional preference analogous to that observed in the viral RNA 
(Fig. 4d, e, Extended Data Fig. 9a and Supplementary Table 3). We 
did not detect specific RNA sequence motifs enriched at Rec-bound 
sites; however, multiple examined Rec iCLIP targets were predicted to 
fold into stable secondary structures (Extended Data Fig. 9b). This is 
reminiscent of the interaction of Rec with its HERVK LTR response 
element, which is mediated by RNA secondary structure, rather than 
a discrete specific binding site’. We also observed Rec association 
with mRNAs encoding surface receptor molecules and ligands (for 
example, FGFR1, FGF13, FGFR3, KLRG2, IGF1R, FZD7, GDF3) and 
chromatin regulators (for example, DNMT1, CHD4) (Extended Data 
Fig. 9a and Supplementary Table 3). 

Given that Rec binding to viral RNAs promotes their nuclear export 
and translation, we next examined if endogenous mRNAs bound by 
Rec are also more efficiently targeted to ribosomes”*”’. Ribosome pro- 
filing of Rec-hECCs, in comparison to wild-type human EC cells, 
revealed both increases and decreases in ribosomal occupancy, with 
differential enrichment of 941 mRNAs, of which 134 were also Rec 
iCLIP targets, representing a significant overlap (P value < 0.05, 
hypergeometric test) (Fig. 4f and Supplementary Table 5). Notably, 
mRNAs bound by Rec in 3’ UTRs or coding sequences were more 
likely to be upregulated in their ribosomal occupancy than expected 
by chance (P value < 0.05, hypergeometric test), but we did not 
observe such enrichment for mRNAs bound in their 5’ UTRs. We also 
noticed that several Rec-bound transcripts (for example, RPL22, 
RPL31, RPS13, RPS20, EIF4G1) encoding ribosome components 
and translation regulators had increased occupancy in Rec-hECCs, 
Figure 4 | HERVK accessory protein Rec upregulates viral restriction 
pathway and engages cellular mRNAs. a, Flow cytometry histograms of 
IFITM1 surface staining in control human EC cells or Rec-hECC cells; 
histogram of negative control cells stained with isotype IgG + Alexa-647 
secondary antibody (A-647) is shown for comparison. Shown is a 
representative result of two independent experiments. b, H1N1(PR8) influenza 
infection of control GFP-hECC cells or two clonal lines of Rec-hECCs. Control 
cells were set as 100%; shown are aggregate results from two independent 
experiments, n = 8 total biological replicates for each condition. Error bars are 
±1 s.d. **P value < 0.005, one-sided t-test. c, Rec iCLIP reads mapped to the 
LTR5HS sequence, n = 2 biological replicates. d, Distribution of Rec binding 
sites on endogenous mRNAs (top) and aggregate Rec iCLIP-seq signal on a 
metagene (bottom), n = 2 biological replicates. CDS, coding DNA sequence. 
e, Distribution of Rec iCLIP reads at representative target mRNAs KLRG2 
(top), RPL22 (bottom); y-axis, iCLIP score, at cut-off = 3 (see Methods for 
details). f, Ribosome profiling signal for all significant genes (FDR < 0.05 
Cuffdiff) in wild-type human EC cells versus Rec-hECCs, n = 4 biological 
replicates. Rec iCLIP targets are coloured in red. 
potentially contributing to additional indirect translational effects of 
Rec overexpression (Fig. 4e, f and Supplementary Table 5). 
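
The overlap statistic quoted above can be illustrated with a short calculation. The sketch below computes the upper-tail hypergeometric probability of seeing at least 134 shared genes between the 941 differentially occupied mRNAs and the ~1,600 Rec iCLIP targets. The size of the gene universe is not stated in the text, so BACKGROUND is a hypothetical value chosen only for illustration, and the result depends strongly on it; this is not the authors' analysis code.

```python
from math import comb


def hypergeom_upper_tail(overlap, population, successes, draws):
    """P(X >= overlap) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` are marked."""
    max_k = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Counts taken from the text; BACKGROUND is an assumption (hypothetical
# number of expressed genes that could have been detected in either assay).
BACKGROUND = 20000
REC_TARGETS = 1600
DIFF_OCCUPANCY = 941
SHARED = 134

expected = DIFF_OCCUPANCY * REC_TARGETS / BACKGROUND
p_value = hypergeom_upper_tail(SHARED, BACKGROUND, REC_TARGETS, DIFF_OCCUPANCY)
print(f"expected overlap by chance: {expected:.1f}")
print(f"P(X >= {SHARED}) = {p_value:.3g}")
```

Exact integer binomials are used so that no external statistics library is needed; a smaller assumed universe raises the expected overlap and weakens the significance.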

Altogether, our results demonstrate that early human development 
is accompanied by the stage-specific transcriptional activation of 
HERVK, translation of its ORFs, and assembly of VLPs (Extended 
Data Fig. 10a). Beyond preimplantation development, we predict that 
HERVK reactivation occurs in human primordial germ cells (PGCs), 
which are also characterized by the presence of OCT4 and genome- 
wide DNA hypomethylation”. HERVK protein products have the 
potential to engage host machinery, as exemplified here by modulation 
of cellular mRNAs by Rec. This fine-tuning of cellular functions by 
HERVK proteins may contribute to human-specific or even indi- 
vidual-specific aspects of early development, as the retroviral ORFs 
are preferentially expressed from the human-specific proviruses, many 
of which are polymorphic in the human population*"’. Finally, our 
data raise the intriguing possibility that HERVK provides an immu- 
noprotective effect for human embryos against different classes of 
viruses sensitive to the IFITM1-type restriction. Although IFITM 
family members were first described as interferon-induced genes, they 
are also classical naive-state and PGC markers in the mouse, which 
nonetheless appear to be dispensable for development”’. These observations 
suggest that IFITM1-mediated restriction may be an evolutionarily conserved 
mechanism protecting both embryos and germ cells 
from either reinfection from infectious ERVs or exogenous viral infec- 
tion (Extended Data Fig. 10a). 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS 


DNA and RNA isolation and reverse transcription. Genomic DNA was isolated 
using phenol:chloroform:isoamyl alcohol (100:100:1; PCI) (Invitrogen). Briefly, cells were 
digested in 10 mM Tris-HCl (pH 8.0), 0.1 M EDTA, 0.5% SDS at 37 °C for 1 h, 
then proteinase K was added to a final concentration of 100 µg ml⁻¹ and then 
incubated for 3 h at 50 °C. DNA was PCI extracted, ethanol precipitated, and 
resuspended in TE. RNA was extracted using Trizol (Invitrogen) according to 
the manufacturer’s instructions. DNase treatment with Turbo DNase (Ambion) 
was performed for 30 min at 37 °C, PCI extracted, ethanol precipitated, and re- 
suspended in water. Reverse transcription was performed with SuperScript III 
(Invitrogen) using ~500 ng of DNase-treated total RNA following the 
manufacturer’s instructions. No-reverse-transcriptase controls were performed where 
necessary. 

Cell lines and culture. NCCIT and HEK293 cells were obtained from ATCC. 
NCCIT cells were maintained in 10% FBS (Omega), 1X Glutamax-I supplement 
(100X stock, Invitrogen), 1X non-essential amino acids (100X stock, Invitrogen), 
and basal media RPMI 1640 (Hyclone). HEK293 cells were maintained in 10% 
FBS, 1x NEAA, 1X glutamax in DMEM-high glucose (Hyclone). Human ES cells 
(H9 line female, Wi-Cell) were used at passage 60-67 and were expanded in 
feeder-free, serum-free medium (mTeSR-1) from StemCell technologies. HESC 
HSF-1 (male) and HSF-8 (male) human ES cells were used at passage 20-28, 
cultured as described earlier and their characterization is described elsewhere’. 
Cells were passaged 1:7 every 5-6 days by incubation with accutase (Invitrogen) 
and the resultant small cell clusters (50-200 cells) were subsequently re-plated on 
tissue culture dishes coated overnight with growth-factor-reduced matrigel (BD 
Biosciences). ELF1 naive human ES cells were obtained from C.W. and cultured as 
previously described’, with 10 ng ml⁻¹ human recombinant LIF (R&D). Cell 
cultures were routinely tested and found negative for mycoplasma infection 
(MycoAlert, Lonza). 

ChIP. ChIP assays were performed from approximately 10⁷ cells per experiment, 
according to previously described protocol with slight modifications’. Briefly, 
cells were crosslinked with 1% formaldehyde for 10 min at room temperature and 
formaldehyde was quenched by addition of glycine to a final concentration of 
0.125 M. Chromatin was sonicated to an average size of 0.5-2 kb, using Bioruptor 
(Diagenode). 50-75 µl of protein G Dynal beads (Invitrogen) were used to capture 
3-5 µg of antibody in phosphate citrate buffer pH 5.0 (2.4 mM citric acid, 5.16 mM 
Na2HPO4) for 30 min at 27 °C. Antibody-bead complexes were rinsed two times 
with PBS and added to sonicated chromatin and rotated at 4 °C overnight. Ten per 
cent of chromatin was reserved as ‘input’ DNA. Magnetic beads were washed and 
chromatin eluted, followed by reversal of the crosslinks and DNA purification. 
Resultant ChIP DNA was dissolved in TE. 

Flow cytometry. Cells were trypsinized and analysed on a CS&T-calibrated BD 
FACS Aria II SORP flow cytometer on a 561 nm laser line for turboRFP, with 582/ 
15BP. For IFITM1 flow cytometry, cells were allowed to recover after trypsiniza- 
tion for 2h at 37 °C in media. Then, 2.5 X 10° cells were washed with PBS/10% 
FBS/0.1% sodium azide and stained with 1:100 IFITM1 antibody (rabbit pAb, 
ProteinTech, #50556193) for 30 min at 4°C. Washed cells were then incubated 
with chick, anti-mouse A647 secondary for 30 min at room temperature. Control 
stainings using rabbit IgG (Santa Cruz) and anti-mouse A647 were also performed. 

Bisulfite sequencing. EpiTect Plus Bisulphite conversion kit (Qiagen) was used 
to bisulfite convert 1 µg genomic DNA as per manufacturer’s instructions. 
Approximately 20 ng of BS-treated DNA was used as a template for 35-40 cycles 
with Platinum taq (Invitrogen, 10966) as per manufacturer’s instructions. A-tailed 
PCR fragments were gel purified and inserted into pGEM-T. 5’ LTR 
provirus-specific BS-PCR was conducted with primers including NcoI and NotI sites to 
facilitate cloning into pGEM-T. Approximately 15 clones were Sanger sequenced 
for both forward and reverse strands. BiQ software was used to align and quantify 
CpG methylation. 
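
The quantification step can be illustrated with a minimal sketch. After bisulfite conversion, a cytosine that was methylated still reads as C, whereas an unmethylated cytosine reads as T, so per-CpG methylation is the fraction of clones retaining C. The function below is not the BiQ software; it assumes, for illustration only, that clone sequences are already aligned to the unconverted reference on the same strand, without gaps and at equal length.

```python
def cpg_methylation(reference, clones):
    """Return {CpG position: fraction of clones methylated} for aligned clones.

    `reference` is the unconverted sequence; each clone is a bisulfite-converted
    read aligned base-for-base to it (an assumption of this sketch).
    """
    reference = reference.upper()
    cpg_positions = [
        i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"
    ]
    result = {}
    for pos in cpg_positions:
        # Only C (methylated) or T (converted, unmethylated) are informative.
        calls = [c[pos].upper() for c in clones if c[pos].upper() in ("C", "T")]
        methylated = sum(1 for base in calls if base == "C")
        result[pos] = methylated / len(calls) if calls else None
    return result


# Toy example: two CpGs (0-based positions 1 and 4) scored across three clones.
example = cpg_methylation("ACGTCGA", ["ACGTTGA", "ATGTTGA", "ACGTCGA"])
print(example)
```

Positions are 0-based indices into the reference; clones with an uninformative base at a CpG (for example a sequencing error) are skipped for that site.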

Protein extraction and immunoblotting. Proteins were extracted using prev- 
iously described protocols”. Briefly, cells were resuspended in buffer A (10 mM 
HEPES, pH 7.9, 10 mM KCl, 1.5 mM MgCl2, 0.34 M sucrose, 10% glycerol) and 
fresh protease inhibitors (Complete EDTA-free, Roche), 1 mM PMSF and 0.1% 
Triton-X 100 were added. Cytoplasmic extract was further clarified by centrifu- 
gation at 13,000r.p.m. at 4°C for 10 min, and total protein concentration was 
assayed with Bradford reagent (Biorad). Equal amounts of protein were run on 
SDS-PAGE gels and then transferred onto Hybond ECL membranes 
(Amersham). Membranes were blocked using 5% milk, PBS, 0.1% Tween-20 for 
1h at 27°C. Primary antibodies (see Supplementary Table 10) were used in block- 
ing solution overnight at 4 °C. Horseradish peroxidase (HRP)-conjugated second- 
ary antibodies were used and chemoluminescence was assayed using Lumi-light 
plus (Roche). 

qPCR. All primers used in qPCR analyses are shown in Supplementary Table 10. 
qPCR was performed using SensiFAST SYBR No-Rox Kit (Bioline) in a 
LightCycler 480 II machine (Roche), using technical triplicates. ChIP-qPCR signals were 
calculated as percentage of input and, unless otherwise indicated, qRT-PCR signal was 
normalized to 18S rRNA. Standard deviations were measured from the averages 
of the technical repeats for each biological replicate and represented as error bars 
±1 s.d. 
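
The two normalizations named above can be written out as a worked example. The sketch below uses the conventional percent-of-input and 2^-ΔCt formulas, assuming 100% amplification efficiency and a 10% input fraction (as reserved in the ChIP protocol); the function names and Ct values are hypothetical and are not taken from the paper.

```python
from math import log2
from statistics import mean


def percent_input(ip_cts, input_cts, input_fraction=0.10):
    """Percent-of-input ChIP signal from technical-replicate Ct values."""
    # Shift the input Ct to the value expected for 100% of the chromatin.
    adjusted_input_ct = mean(input_cts) - log2(1 / input_fraction)
    return 100.0 * 2 ** (adjusted_input_ct - mean(ip_cts))


def relative_expression(target_cts, reference_cts):
    """2^-dCt expression of a target relative to a reference (e.g. 18S rRNA)."""
    delta_ct = mean(target_cts) - mean(reference_cts)
    return 2 ** -delta_ct


# Hypothetical technical triplicates.
chip_signal = percent_input([27.0, 27.0, 27.0], [25.0, 25.0, 25.0])
expression = relative_expression([20.0, 20.0, 20.0], [18.0, 18.0, 18.0])
print(f"ChIP signal: {chip_signal:.2f}% of input")
print(f"expression relative to reference: {expression:.3f}")
```

Averaging the technical repeats first and then computing the standard deviation across biological replicates matches the error-bar description in the text.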

Plasmid and constructs. HERVK LTR5_HS sequence from HERVK-con” was 
cloned upstream of a miniTK promoter driving turboRFP and inserted into piggy- 
back transposon (SystemBio). Motif mutations for OCT4 or SOX2 were produced 
by replacing the respective motif with a NotI site. 2.5 µg of reporter vector along 
with 0.5 µg of piggy-back transposase were transfected into cells using 18 µl 
Lipofectamine 2000 (Invitrogen) in 6-well plates. 400 µg ml⁻¹ G418 (Amresco) was 
used to select for integrants. Cells were analysed >10 days later to minimize signal 
from non-integrated reporter expression. Complementary DNAs encoding OCT4 
or SOX2 were cloned into pcDNA containing carboxy-terminal or amino- 
terminal Flag—haemagglutinin (HA) tags, respectively. The same LTR regulatory 
regions were cloned into pGL3 firefly luciferase reporters, and constructs were 
co-transfected with Renilla luciferase to perform dual luciferase assays. SV40 
promoter/enhancer firefly luciferase was used as a positive control. Transgene constructs 
for Rec expression in NCCIT cells were used with EF1α promoter, N-terminal 
Flag-eGFP-tagged Rec cloned into a piggy-back construct with a puromycin 
selectable marker. Control construct using Flag-eGFP alone (vector only control) 
was also used in parallel. Transgene constructs were cotransfected with piggy-back 
transposase plasmid to generate stable lines. Clones were selected and expanded. 
Flag-eGFP-Rec clone #1 has ~30X endogenous expression of rec mRNA (as 
measured by qPCR) and Flag-eGFP-Rec clone #2 has ~14X endogenous expression 
of rec mRNA (qPCR), data not shown. 

siRNA knockdown. siRNA was generated using baculovirus-produced Giardia 
DICER as described™. Briefly, 1 µg of PCR product was in vitro transcribed using 
Megascript T7 (Ambion) and digested using DICER at 37 °C for 16h. siRNA was 
purified using Purelink RNA mini Kit (Ambion) and the absence of >22 nucleo- 
tides RNA was verified using gel electrophoresis and ethidium bromide staining. 
NCCIT cells were plated onto Matrigel-coated 24-well plates, transfected using 
1.5 µl of RNAi-max (Invitrogen) in optimem (Gibco) with 25 nM siRNA concentrations 
for 4 h before addition of fresh media. siRNA knockdowns were performed 
for three consecutive days, and cells were harvested 24 h after the final transfection. 
Two independent siRNA pools were generated for OCT4, NANOG and 
SOX2, one each for turboRFP (non-targeting control) and rec, which overlaps the 
env ORF. Primers used to generate double-stranded RNA (dsRNA) templates are 
listed in Supplementary Table 10. 

Human embryo source and procurement. Human embryos were obtained as 
previously described*. Approximately 25 supernumerary human blastocysts from 
successful IVF cycles, subsequently donated for non-stem-cell research, were 
obtained with written informed consent from the Stanford University RENEW 
Biobank. De-identification was performed according to the Stanford University 
Institutional Review Board-approved protocol #10466 entitled ‘The RENEW 
Biobank’ and the molecular analysis of the embryos was in compliance with 
institutional regulations. Approximately 25% of the embryos were from couples 
that used donor gametes and the most common cause of infertility was unex- 
plained at 35% of couples. No protected health information was associated with 
any of the embryos. 

Human embryo thawing and culture. Human embryos cryopreserved at the 
blastocyst stage were thawed by a two-step rapid thawing protocol using Quinn’s 
Advantage Thaw Kit (CooperSurgical) as previously described**°*. In brief, 
either cryostraws or vials were removed from the liquid nitrogen and exposed 
to air before incubating in a 37°C water bath. Once thawed, embryos were 
transferred to a 0.5 mol l⁻¹ sucrose solution for 10 min followed by a 0.2 mol l⁻¹ 
sucrose solution for an additional 10 min. The embryos were then washed in 
Quinn’s advantage medium with HEPES (CooperSurgical) plus 5% serum protein 
substitute (CooperSurgical) and each transferred to a 25 µl microdrop of 
Quinn’s advantage blastocyst medium (CooperSurgical) supplemented with 10% 
serum protein substitute under mineral oil (Sigma). The embryos were cultured 
at 37 °C with 6% CO2, 5% O2 and 89% N2 under standard human embryo 
culture conditions in accordance with current clinical IVF practice. Embryos 
used in this study were DPF 5-6. 

Immunofluorescence. Cells were grown on Matrigel-coated glass coverslips, 
fixed using EM-grade 4% PFA (Electron Microscopy Sciences) for 15 min at 
27 °C, washed three times with PBS, blocked and permeabilized with 1% BSA, 
0.3% Triton-X 100 in PBS (antibody buffer) supplemented with 5% serum for 
species-matched secondary antibody for 1h at 27°C. Primary antibodies were 
resuspended in antibody buffer and incubated at 4°C overnight. Washes were 
performed three times using 0.1% Triton-X 100 in PBS, and secondary antibodies 
were added for 1h at 27 °C in the dark. Cells were mounted using Prolong-fade 
gold (Invitrogen) with DAPI and imaged on Zeiss LSM 700 confocal. 
For embryo immunostaining, the zona pellucida (ZP) was removed from each 
embryo by treatment with acidified Tyrode’s solution (Millipore) and ZP-free 
embryos were washed in PBS plus 0.1% BSA and 0.1% Tween-20 (PBS-T; 
Sigma-Aldrich) before fixation in 4% paraformaldehyde for 20 min at room 
temperature. Once fixed, the embryos were washed three times in PBS-T to remove any 
residual fixative and permeabilized in 1% Triton X-100 (Sigma-Aldrich) for 1h at 
room temperature. Following permeabilization, the embryos were washed three 
times in PBS-T and then blocked in 4% of chicken or goat serum in PBS-T over- 
night at 4 °C. The embryos were incubated with primary antibodies in PBS-T with 
1% serum sequentially for 1 h each at room temperature at the following dilutions: 
1:200 OCT4, 1:100 Gag/Capsid. Primary signals were detected using the appropri- 
ate 488- or 647-conjugated Alexa Fluor secondary antibody (Invitrogen) at a 1:250 
dilution at room temperature for 1h in the dark and subsequently DAPI stained. 
Immunofluorescence was visualized by sequential imaging, whereby the channel 
track was switched each frame to avoid cross-contamination between channels, 
using a Zeiss LSM510 Meta inverted laser scanning confocal microscope. The 
instrument settings, including the laser power, pinhole and gain, were kept constant 
for each channel to facilitate semi-quantitative comparisons between embryos. 
DNA demethylation treatment. HEK293 cells were plated on Matrigel-coated 
24-well plates, and treated with 0, 1 or 10 µM 5-aza-2'-deoxycytidine (Calbiochem) 
freshly prepared every 24 h. Cells were then transfected with 1 µg each of 
pcDNA3.1-OCT4 and pcDNA3.1-SOX2 expression plasmids. Media was changed 
24 h later, and cells were harvested 3 days after transfection for RNA analysis. 
Human ES cells (H9) were grown as described earlier, except mTeSR was 
supplemented with Rock inhibitor (Y-27632, Sigma) at 5 µM, and treated with 0, 1 or 
10 µM 5-aza-2'-deoxycytidine (Calbiochem) for 24 h. 

RNA-seq library construction. Libraries were constructed as described”’, using 
~10 µg of total RNA followed by poly-A selection with oligo-dT beads, ligation 
and ten cycles of PCR with NEBnext kit oligonucleotides, and sequenced using 
Illumina Hi-Seq2000 at the Stanford Sequencing Facility or ELIM Bio. 
Sequence analysis. For RNA-seq repeat analysis of data from embryo and human 
ES cell libraries (for Fig. 1 and Extended Data Fig. 1), FASTQ files were aligned to 
repbase consensus sequences (downloaded from RepBase) with bowtie using the 
command “bowtie -q -p 8 -S -n 2 -e 70 -l 28 --maxbts 800 -k 1 --best”. These bowtie 
parameters ensure that only the best alignment (highest score) is reported and 
that only one alignment per read is reported; that is, these settings do not 
allow multiple-matching. For Fig. 2a analysis of HERVK proviruses, RNA-seq 
reads were aligned to hg19 using the same parameters described earlier, and the 
overlap between the manually curated HERVK provirus data set* is reported. For 
RefSeq analysis for RNA-seq libraries generated for this paper (ELF1 naive or 
primed human ES cells; from human EC cell siRNA RNA-seq, or Rec-hECC 
versus wild-type human EC cell experiments), reads were processed using 
DNAnexus software to obtain read counts and RPKM. Reads were counted and 
where indicated normalized to repeat length and library size using RPKM. 
Differential expression in RNA-seq experiments described earlier was performed 
using DESeq, with reported FDR using Benjamini-Hochberg correction. 
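
The length and library-size normalization mentioned above follows the standard RPKM definition, reads per kilobase of feature per million mapped reads. The sketch below is an illustration with hypothetical counts and feature lengths; it is not the DNAnexus pipeline used in the study.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical repeat consensus lengths (bp) and read counts for one library.
lengths = {"repeat_A": 2000, "repeat_B": 500}
counts = {"repeat_A": 500, "repeat_B": 500}
library_size = 10_000_000  # total mapped reads (assumed)

normalized = {
    name: rpkm(counts[name], lengths[name], library_size) for name in counts
}
print(normalized)
```

With equal read counts, the shorter feature receives the higher RPKM, which is why repeat families of different consensus length are compared on this scale rather than by raw counts.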
Interferon-induced gene set analysis. Genes were defined as interferon induced 
if they were induced fivefold in interferon-treated cells/tissues for experimentally 
deposited data sets found in Interferome database” 
(http://interferome.its.monash.edu.au/interferome/home.jspx). 

LTR5HS-associated gene analysis. RefSeq genes were classified as associated or 
not associated with LTR5HS (downloaded from UCSC genome browser table) 
using Great Analysis Software (Bejerano laboratory, Stanford University) with a 
cut-off of 100 kb distance from the TSS. These classified RefSeq genes were then 
compared using the RPKM and DESeq analysis as described earlier. Differential 
enrichment of LTR5HS-associated transcripts in naive/primed upregulated versus 
naive/primed downregulated was analysed using non-paired Wilcoxon test, and 
significance is reported at P value < 0.05. Higher average naive/primed RPKM of 
LTR5HS-associated versus non-LTR5HS-associated genes was tested using 
non-paired Wilcoxon test. 

Chimaeric transcript identification. One-hundred base-pair paired-end RNA- 
seq reads generated with ELF1 naive versus primed human ES cells (see earlier) 
were analysed using a published pipeline”. Briefly, Cufflinks software was used to 
perform de novo identification of transcript models. These transcript models were 
then used to identify splice junctions in which one side of the transcript model 
overlapped the GTF file (for hg19 from UCSC), cataloguing known genes and long 
noncoding RNAs (lincRNAs), and the other side of the transcript model aligned to 
hg19 classified as a repeat (UCSC genome browser, repeat track). Transcripts that 
fulfilled these criteria were classified as chimaeric transcripts, and are reported in 
Supplementary Table 1. 

Clustering. Hierarchical clustering was performed using Gene-e software 
(http://www.broadinstitute.org/cancer/software/GENE-E/index.html) using K-means 
clustering of log2-transformed RPKM. 
Statistical tests. A list of the statistical tests, multiple-hypothesis testing correc- 
tions, and normality criteria for parametric tests is reported in Supplementary 
Table 7. 

Electron microscopy. Samples were fixed using 4% PFA and 0.01% glutaraldehyde 
for 15 min at 27 °C. Routine heavy metal staining was conducted where 
indicated. Immuno-TEM was performed with a 1:100 dilution of anti-HERVK Gag/Capsid 
using overnight incubation at 4 °C, and labelling was visualized using 5 nm gold-labelled 
anti-mouse secondary antibody. Secondary only controls demonstrated specificity 
of the antibody for this application. TEM was performed at the Electron 
Microscopy core at Stanford University using a Jeol JEM-1400 electron micro- 
scope. 

iCLIP and data analysis. The iCLIP method was performed as described before 
with the specific modifications below’*. Flag-GFP-Rec (FG-Rec)-expressing NCCIT 
cells were UV-C crosslinked to a total of 0.3 J cm⁻². Each iCLIP experiment was 
normalized for total protein amount, typically 1 mg, and partially digested with 
RNase I (Life Technologies) for 10 min at 37 °C and quenched on ice. FG-Rec was 
isolated with anti-Flag agarose beads (Sigma) for 3 h at 4 °C on rotation. Samples 
were washed sequentially in 1 ml for 5 min each at 4 °C: 2X high stringency buffer 
(15 mM Tris-HCl pH 7.5, 5 mM EDTA, 2.5 mM EGTA, 1% Triton X-100, 1% 
Na-deoxycholate, 120 mM NaCl, 25 mM KCl), 1X high salt buffer (15 mM Tris-HCl 
pH 7.5, 5 mM EDTA, 2.5 mM EGTA, 1% Triton X-100, 1% Na-deoxycholate, 1 M 
NaCl), 1X NT2 buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 1 mM MgCl2, 
0.05% NP-40). Purified FG-Rec was then eluted from anti-Flag agarose beads 
using competitive Flag peptide elution. Each sample was resuspended in 500 µl 
of Flag elution buffer (50 mM Tris-HCl pH 7.5, 250 mM NaCl, 0.5% NP-40, 0.1% 
Na-deoxycholate, 0.5 mg ml⁻¹ Flag peptide) and rotated at 4 °C for 30 min. The 
Flag elution was repeated once for a total of 1 ml elution. FG-Rec was then cap- 
tured using anti-GFP antibody (Life Technologies, A-11122) conjugated to 
Protein A dynabeads (Life Technologies) for 3 h at 4 °C on rotation. Samples were 
then washed as described previously in the anti-Flag agarose beads. 3’-End RNA 
dephosphorylation, 3’-end single-stranded RNA (ssRNA) ligation, 5’ labelling, 
SDS-PAGE separation and transfer, autoradiograph, RNP isolation, Proteinase K 
treatment and overnight RNA precipitation took place as previously described**. 
The 3’-ssRNA ligation adaptor was modified to contain a 3’ biotin moiety as a 
blocking agent. The iCLIP library preparation was performed as described else- 
where**”’. Final library material was quantified on the BioAnalyzer High 
Sensitivity DNA chip (Agilent) and then sent for deep sequencing on the 
Illumina HiSeq 2500 machine for 1X 75 bp cycle run. iCLIP data analysis was 
performed as previously described”. For analysis of repetitive noncoding RNAs, 
custom annotation files were built from the Rfam database. For analysis of endo- 
genous retroviral elements, custom annotation files were built from the repbase 
database. iCLIP reads were filtered for quality, barcode split, PCR-duplicate 
removed, trimmed (5’ and 3’ends), and mapped for unique matches under para- 
meters previously described**”’. Bioinformatic pipeline used for iCLIP data ana- 
lysis is described previously”. Briefly, RT stops were used to map nucleotide 
resolution of Rec binding, and only nucleotides supported with three independent 
RT stops in two replicates (with at least one RT stop in each replicate) were 
reported as binding events, and are reported in Supplementary Table 3. 
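
The replicate-support filter described above can be sketched as follows. The code reads the criterion as requiring at least three RT stops in total across the two replicates, with at least one in each; that reading, the data layout (a dictionary keyed by chromosome, strand and position) and the function name are assumptions made for illustration, not the published pipeline.

```python
def call_binding_sites(rep1_stops, rep2_stops, min_total=3, min_per_replicate=1):
    """Return sorted positions passing the replicate-support filter.

    Each argument maps (chrom, strand, position) -> number of independent
    RT stops (PCR duplicates already removed) in that replicate.
    """
    sites = []
    # Only positions seen in both replicates can have support in each.
    for pos in rep1_stops.keys() & rep2_stops.keys():
        c1, c2 = rep1_stops[pos], rep2_stops[pos]
        supported_in_each = c1 >= min_per_replicate and c2 >= min_per_replicate
        if supported_in_each and c1 + c2 >= min_total:
            sites.append(pos)
    return sorted(sites)


# Hypothetical RT-stop counts for two replicates.
rep1 = {("chr1", "+", 100): 2, ("chr1", "+", 200): 3, ("chr1", "+", 300): 1}
rep2 = {("chr1", "+", 100): 1, ("chr1", "+", 300): 1}
sites = call_binding_sites(rep1, rep2)
print(sites)
```

In this toy input, position 200 is dropped because it has no support in the second replicate, and position 300 is dropped because it has only two RT stops in total.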
Ribosome profiling. Human EC cells (NCCIT) were cultured as described earlier. 
Total RNA was extracted using Trizol (Life Technologies) and used as input 
material for the ARTseq Ribosome Profiling Kit—Mammalian (Epicentre) follow- 
ing the manufacturer’s protocol with the following modifications. The 3’ RNA 
ligation adaptor and cDNA synthesis primers from the iCLIP protocol were used for 
library construction. Final library material was quantified as in the iCLIP experi- 
ments and sequenced on the Illumina HiSeq 2500 machine for 1X 75 bp cycle run. 
Sequencing reads were preprocessed (quality filter, PCR duplicate removal, and 
trimming) as in the iCLIP protocol. Mapping was performed using an established 
pipeline previously described”. Briefly, reads were aligned to 45S rDNA repeat 
sequence with bowtie to remove residual rRNA reads from libraries. Non-aligning 
reads (non-rRNA) were then aligned to hg19 with TopHat2 and differential 
expression was identified using default parameters for CuffDiff/Cufflinks software 
with significance at FDR < 0.05. 

Influenza infection experiments. Human EC cells (NCCIT) were plated in 
duplicate (1.5 X 10° cells per well) on a 96-well flat-bottom plate in 100 µl Virus 
Diluent (DMEM, Gibco, supplemented with 1% BSA, 1X antibiotics and 20 mM 
HEPES). Cells were incubated at 37 °C and 5% CO2 for 1.5 h. Wild-type human EC 
cells and Rec-hECCs were then infected with virus (influenza A/H1N1/PR8/ 
1934, diluted 1:10 into 100 µl virus diluent, increasing total volume to 200 µl). 
Cells were incubated at 37 °C for 1 h. FBS (Hyclone) was added to the wells 
to a final concentration of 10% FBS. Cells were incubated at 37 °C for 5 h. 
20 mM EDTA (20 µl) was added to all wells and mixed thoroughly to stop 
infection. Cells were washed with 200 µl 1X PBS (Hyclone), re-suspended in 
100 µl 1X BD FACS Lysing Solution (BD Biosciences) and stored at −80 °C for 
later processing. 

For staining and analysis, cells were thawed in 37 °C for 20 min. One-hundred 
microlitres FACS wash (1X HyClone DPBS with 2% FBS) was added to each well 
and plate was centrifuged. Cell pellets were re-suspended in 200 µl BD FACS 
Permeabilizing Solution II (BD Biosciences). Cells were incubated at room 
temperature in the dark for 10 min. Plate was centrifuged and cells were washed twice 
with 200 µl FACS wash. Cells were stained with primary antibody (mouse anti- 
influenza A nucleoprotein, C43 clone, Abcam) diluted to 2 ng ml⁻¹. Cells were 
incubated in the dark at room temperature for 30 min and washed twice. Cell 
pellets were resuspended in 2 µg ml⁻¹ of secondary antibody (chicken anti-mouse 
Alexa647, Invitrogen) in 50 µl FACS wash and incubated in the dark at room 
temperature for 30 min. Cells were washed twice and cell pellets were resuspended 
in 1% PFA (Electron Microscopy Sciences). Cells were analysed on the MACSQuant 
Analyzer (Miltenyi Biotec). MACSQuant Calibration Beads (Miltenyi Biotec) were 
used for calibration of the cytometer. Compensation controls were run using 1:1 
mixture of CompBead Plus Anti-mouse Igκ (BD) and negative control beads. Single 
stained cellular controls were run in parallel to infected and uninfected samples. 
Data were analysed by FlowJo 9.7.6 (TreeStar). Cells were gated to exclude dead cells 
and debris. Infection levels were background subtracted using uninfected wells, and 
normalized to infection levels in GFP-hECC for each run. 
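
The background subtraction and normalization described above amount to a short calculation. The sketch below assumes that each well yields a percentage of nucleoprotein-positive cells after gating; the function name and all numbers are hypothetical and serve only to show the arithmetic, with the GFP-hECC control set to 100% as in Fig. 4b.

```python
from statistics import mean


def normalized_infection(sample_pct, uninfected_pct, control_pct):
    """Background-subtract and express infection relative to control (=100%).

    All arguments are lists of percent-positive cells from a single run.
    """
    background = mean(uninfected_pct)
    control = mean(control_pct) - background
    return [100.0 * (value - background) / control for value in sample_pct]


# Hypothetical percent-positive values from one run.
uninfected_wells = [0.5, 0.5]
gfp_control_wells = [20.5, 20.5]
rec_clone_wells = [10.5, 5.5]

relative = normalized_infection(rec_clone_wells, uninfected_wells, gfp_control_wells)
print(relative)
```

Normalizing within each run before pooling lets replicates from independent experiments be aggregated even when absolute infection rates differ between runs.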

RNA-seq data sets. Data sets used in this study can be accessed from: Array 
Express Database (accession number E-MTAB-2031)"; Gene Expression 
Omnibus (accession number GSE36552)'°; Gene Expression Omnibus (accession 
number GSE44183)"; Array Express (accession number E-MTAB-2857)”. 
Sequencing data sets generated for this study are deposited in the Gene 
Expression Omnibus under accession number GSE63570, and are summarized in 
Supplementary Table 8. 
Extended Data Figure 1 | Additional single-cell RNA-seq data analyses 
from preimplantation human embryos (supporting Fig. 1). a, Heat map and 
hierarchical K-means clustering of highly expressed (average RPKM > 6 across 
89 embryo libraries) repetitive elements in single cells of human 
preimplantation embryos at indicated developmental stages (top) and HERVK 
expression (bottom) using indicated data sets. b, HERVH expression (RPKM) 
in single cells of human embryos at indicated preimplantation stages. Solid 


line indicates mean. RNA-seq data are taken from ref. 10. c, HERVH 
expression (RPKM) in single cells of human blastocysts, grouped by lineage. 
Solid line indicates mean. Oocyte (n = 3), zygote (n = 3), 2-cell (n = 6), 4-cell 
(n = 11), 8-cell (n = 19), morula (n = 16), TE (n = 18), PE (n = 7), EPI 
(n = 5), p0 (n = 8), p10 (n = 26). RNA-seq data set was from ref. 10. 
d, Genome browser snapshot showing 100 bp PE-RNA-seq reads from ELF1 
naive human ES cells aligning at the HERVK 108 provirus on chromosome 7. 
Extended Data Figure 2 | LTR5 alignments, HERVK expression data in cell 
lines, and control ChIP-qPCR analyses in primed human ES cells 
(supporting Fig. 2). a, Top, presence of HERVK(HML-2) sequences in Old 
World primates, but absence in New World primates. Middle, schematic of 
HERVK proviral genome; all human-specific insertions contain LTR5HS. 
Bottom, phylogenetic relationship of HERVK LTR subclasses showing high 
degree of sequence similarity. Pro, protease; Pol, polymerase; Gag, group- 
specific antigen; Env, envelope. Bottom, ClustalW multiple sequence alignment 
of indicated HERVK LTR sequences (top), region around OCT4 motif is boxed, 
phylogenetic tree (bottom) indicating presence/absence of OCT4 motif. 

b, HERVK protein expression in human EC cells and human ES cells. Protein 
extracts from human EC cells (NCCIT) and human ES cells (H9) were analysed 
by immunoblotting with an antibody detecting HERVK Gag precursor and the 
processed Capsid (top), or the glycosylated, unprocessed form of the HERVK 
envelope protein Env (bottom). Tata-binding protein (TBP) was used as a 
loading control. Shown is a representative result of three independent
experiments. c, RT-qPCR analysis of HERVK RNA expression in human EC
cell line NCCIT, human ES cell line H9, and HEK293 cells. Three distinct qPCR 
amplicons, corresponding to env, gag and pro are shown. Samples were 
normalized to 18S ribosomal RNA levels. *P value < 0.05, one-sided t-test. Error
bars are ±1 s.d., n = 3 biological replicates. d, HERVK gag or env expression in
male human ES cell lines HSF-1, HSF-8, female human ES cell line H9 and
human EC cell line NCCIT. *P value < 0.05, one-sided t-test compared to
control siRNA, n = 3 biological replicates. Error bars are ±1 s.d. e, RT-qPCR
analysis of HERVK transcripts after siRNA knockdown of NANOG, OCT4 or
SOX2 in human EC cells (NCCIT). Signals were normalized to 18S rRNA. *P
value < 0.05, one-sided t-test compared to control siRNA, n = 3 biological
replicates. Error bars are ±1 s.d. f, ChIP-qPCR analyses of human ES cells (H9)
with indicated antibodies. Signals were interrogated with primer sets for positive
control regions (active human ES cell OCT4 and SOX2 enhancers), LTR5HS, or
non-repetitive, intergenic negative regions, as indicated at the bottom. Shown is 
a representative result of two biological replicates. 
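
The RT-qPCR panels above report signals normalized to 18S rRNA, with error bars of 1 s.d. over three biological replicates. The legend does not say how the normalization was computed. The sketch below assumes the common 2^-ddCt approach with roughly 100% primer efficiency, and all Ct values are hypothetical:

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change by the 2^-ddCt method (assumes ~100% primer efficiency)."""
    delta_ct_sample = ct_target - ct_reference              # normalize to 18S
    delta_ct_control = ct_target_ctrl - ct_reference_ctrl   # same for the control
    return 2.0 ** -(delta_ct_sample - delta_ct_control)


# Hypothetical Ct values for three biological replicates: (target, 18S).
sample_cts = [(24.0, 10.0), (24.5, 10.2), (23.8, 9.9)]
control_ct = (27.0, 10.0)  # hypothetical control condition

folds = [relative_expression(t, r, *control_ct) for t, r in sample_cts]
print(f"fold change = {mean(folds):.2f} +/- {stdev(folds):.2f} (s.d., n = {len(folds)})")
```

A lower Ct means more template, so a target that crosses threshold three cycles earlier than in the control (after 18S correction) corresponds to an eightfold increase.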
Extended Data Figure 3 | HERVK regulation by OCT4 and DNA
methylation (supporting Fig. 2). a, Transcription factor knockdown in human 
EC cells (NCCIT). Cells were transfected with siRNA pools targeting indicated 
transcription factors and protein depletion was measured by immunofluorescence
with respective antibodies in comparison with control, mock-transfected cells. 
DAPI (blue), OCT4 (green, left), NANOG (green, middle), SOX2 (green, right). 
Shown is one of three representative fields of view at ×20 magnification. b, Dual
luciferase assays with indicated reporter constructs in human EC cells (NCCIT)
showing that mutation of OCT4 site decreases reporter activity. n = 3 biological
replicates, error bars ±1 s.d. *P value < 0.05, one-sided t-test. SV40 enhancer/
promoter construct was used as a positive control. c, Bisulfite sequencing for
indicated cell types (WT33 human IPSC) analysing consensus LTR5HS-specific
amplicon as in Fig. 2e. d, Bisulfite sequencing analysis of HERVK proviral 
consensus amplicon containing 3’ end of LTR, primer binding site, and 5’ region 
of Gag ORF (see Extended Data Fig. 2a) in indicated cell types: ELF1 naive
human ES cell, WT33 human IPSC, NCCIT human EC cell, or H9 human ES cell.
e, RT-qPCR analysis of HERVK RNA levels in HEK293 cells treated with 
indicated concentrations of 5-aza-2'-deoxycytidine for 3 days, followed by 
transfection with OCT4/SOX2 expression constructs and RNA collection 48 h 
after transfection. qPCR primer sets were designed to three independent 
amplicons of HERVK. *P value < 0.05, one-sided t-test. n = 4 biological
replicates, error bars ±1 s.d.
Extended Data Figure 4 | HERVK Gag/Capsid antibody validation and
staining (supporting Fig. 3). a, Immunofluorescence analysis of human EC 
cells (NCCIT) and human ES cells (H9) stained with DAPI (blue), OCT4 
(green), Gag/Capsid (red), or IgG control (bottom). White boxes indicate 
regions shown in higher magnification/merge (right). Shown are representative 
fields of three independent experiments. b, Sensitivity of HERVK Gag/Capsid 
antibody immunoblot signal to HERVK knockdown. Human EC cells were 
transfected with one of three independent siRNA pools targeting HERVK Gag or 
with a control, non-targeting pool (synthesized against RFP) and total protein 
was analysed by immunoblotting with anti-Env and anti-Gag/Capsid antibodies. 
1:2 serial dilution of total protein was loaded, as indicated. Blots were stripped
and re-probed with TBP as a loading control. Shown is a representative result of
two independent experiments. c, Sensitivity of HERVK Gag/Capsid antibody 
immunofluorescence signal to siRNA knockdown of Gag/Capsid (top) or 
control siRNA targeting RFP (bottom). Shown is a representative result of three 
fields of view. Magnification, ×20. d, Immunofluorescence of naive ELF1 human
ES cells with antibodies against OCT4 (green), HERVK Gag/Capsid (pink),
DAPI in blue. Region marked with white box on left is shown with larger
magnification (bottom). Magnification, ×20 and ×40, respectively. e, Another
representative example of immunofluorescence of human blastocysts with
DAPI (blue), OCT4 (green) and Gag/Capsid (red) shown (n = 19 blastocysts;
DPF 5-6). Original magnification, ×40.
Extended Data Figure 5 | TEM analyses of human EC cells and control
embryo staining (supporting Fig. 3). a, TEM analysis of human EC cells 
(NCCIT) with heavy metal staining; arrow indicates VLPs. Boxed region is 
shown with higher magnification in an inset. Scale bar = 500 nm. Shown is a 
representative example of two independent experiments. b, TEM immuno- 
gold labelling of human EC cells (NCCIT) with Gag/Capsid antibodies. Shown 
is a representative example from two independent experiments. c, Secondary
antibody only control for immuno-gold labelling of human blastocysts. Shown
is a representative example from eight fields of view. d, Model figure 
summarizing HERVK transcriptional regulation in human embryos and in 
vitro cultured pluripotent cells. Dashed lines indicate inference of OCT4, DNA 
methylation and HERVK level changes at implantation from those observed 
between naive and primed human ES cells, in the absence of data from actual 
postimplantation human embryos. 
Extended Data Figure 6 | Correlation of HERVK LTR5HS elements with
gene expression (supporting Fig. 4). a, Number of splice junctions identified 
linking indicated HERV class to annotated RefSeq genes. Analysis was done
using RNA-seq data set from ELF1 naive human ES cells, n = 3 biological 
replicates. b, Number of reads supporting chimaeric transcripts from indicated 
HERV class in ELF1 naive human ES cells, n = 3 biological replicates. 

c, Expression of LTR5HS linked genes plotted as a function of distance to the 
gene’s transcription start site (TSS). x-axis: distance of TSS to the nearest 
LTR5HS in kb; y-axis: fold change in expression of the linked gene in ELF1
naive versus primed human ES cells (this study, left) or expression of the linked 
gene in 3iL versus primed H1 human ES cells (right, ref. 12). d, Top, histograms 
showing expression of all genes that significantly change in expression between
naive and primed ELF1 human ES cells (top histogram, white) or significantly 
changed genes that are LTR5HS associated (bottom histogram, blue); 
expression values from naive versus primed ELF1 human ES cell RNA-seq data
sets (FDR < 0.05 DESeq). Fisher’s exact test gives stated P value, indicating
enrichment of LTR5HS-linked genes in naive upregulated category. Bottom, 
quantification of average expression of LTR5HS-linked (blue) or unlinked 
(white) genes. Non-paired Wilcoxon test with stated P value indicating that 
genes linked to 1 or more LTR5HS have significantly higher mean expression. 
e, Top, histograms showing expression of all genes that significantly change in 
expression between 3iL and primed H1 human ES cells (top histogram, white) 
or significantly changed genes that are LTR5HS associated (bottom
histogram, blue); expression values from RNA-seq data sets reported
previously (ref. 12), FDR < 0.05 DESeq. Fisher’s exact test gives stated P value
indicating enrichment of LTR5HS-linked genes in naive upregulated category. 
Bottom, quantification of average expression of LTR5HS-linked (blue) or 
unlinked (white) genes. Non-paired Wilcoxon test with stated P value 
indicating that genes linked to 1 or more LTR5HS have significantly higher 
mean expression. 
Extended Data Figure 7 | rec and IFITM1 expression in naive human ES
cells, and effect of Rec expression on H1N1(PR8) infection
(supporting Fig. 4). a, Left, RT-qPCR analysis of HERVK rec expression
levels in ELF1 naive human ES cells (n = 3 biological replicates) or H9 
primed human ES cells (one biological replicate). Normalized to 18S rRNA. 
Right, Rec RNA levels in indicated blastocyst lineages. Solid
line indicates mean; data are from ref. 10. b, RNA-seq quantification of
IFITM1 RNA levels in naive or primed ELF1 human ES cells (left, this study) 
or 3iL human ES cells versus primed H1 human ES cells from ref. 12 (right). 
n = 3 biological replicates for each condition, error bars are ±1 s.d. Asterisk
indicates significance at FDR < 0.05, DESeq. c, Flow cytometry for surface-localized
IFITM1 staining in the indicated H9 human ES cells or naive ELF1
human ES cells (top) or, as a control for IFITM1 antibody specificity, 
knockdown of IFITM1 with two independent IFITM1 siRNA pools 
compared to control siRNA-treated cells in Flag-eGFP-Rec-hECCs 
(bottom). d, Left, IFITM1 expression in control human EC cell versus
Rec-hECC (NCCIT) RNA-seq data sets. n = 2 biological replicates.
Significance = FDR < 0.05, DESeq. Right, IFITM1 expression in control 
siRNA versus Rec siRNA-treated human EC cells (NCCIT) RNA-seq. n = 3 
biological replicates, error bars are ±1 s.d. Significance = FDR < 0.05,
DESeq. e, Flow-cytometry profiles for indicated cell types in H1N1(PR8) 
infected (top) or non-infected (bottom) wild-type (WT) control human EC 
cells or Flag-GFP-Rec-hECCs, clone #1. Shown is one representative 
example of four independent experiments showing a co-plating experiment 
in which GFP-Rec cells and wild-type control (GFP negative) cells are 
infected in the same well, stained in the same tube and identified by GFP 
fluorescence after gating for FSC and SSC. f, Scatterplot of ELF1 naive versus 
primed human ES cell RNA-seq showing all interferon-induced genes, with 
differentially regulated genes (FDR < 0.05 DESeq, n = 3 biological 
replicates each) highlighted in red. There is a significant overlap between 
differentially regulated genes and interferon-induced genes as measured by a
hypergeometric test (P value < 0.05). 
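
Panel f summarizes the overlap between differentially regulated genes and interferon-induced genes with a hypergeometric test. The legend gives no gene counts, so the sketch below uses hypothetical numbers purely to show how such an upper-tail P value is obtained; the function and parameter names are illustrative and not taken from the paper:

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement
    from `population` items, of which `successes` are marked."""
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 expressed genes, 300 interferon-induced genes,
# 1,500 differentially regulated genes, 45 genes in both sets.
p_value = hypergeom_upper_tail(20_000, 300, 1_500, 45)
print(f"P(overlap >= 45) = {p_value:.3g}")
```

With these made-up counts the expected overlap by chance is 300 × 1,500 / 20,000 = 22.5 genes, so an observed overlap of 45 falls in the upper tail.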
Extended Data Figure 8 | iCLIP analysis of Rec-associated RNAs
(supporting Fig. 4). a, Diagram of iCLIP-seq procedure (see Methods for 
details). Briefly, cells are crosslinked using ultraviolet, lysed and digested 
with RNase to trim RNAs. Sequential immunopurification is performed 
using Flag M2, peptide elution, and GFP immunoprecipitation (IP). After 
stringent washing, RNAs are recovered and either radiolabelled (shown in 
b) or reverse transcribed and prepared for Illumina HTPS libraries. 

b, Autoradiogram of labelled RNAs (top) recovered from ultraviolet- 
crosslinked cells using sequential Flag-eGFP immunoprecipitation from: 
wild-type human EC cells (lanes 1, 2), Flag-eGFP control human EC cells 
(lanes 3, 4), or two independent Rec-hECC transgenic lines (lanes 5-8), 
separated on an SDS-polyacrylamide gel electrophoresis (SDS-PAGE) gel. 
Free Rec protein runs as a ~35 kDa band, while Rec protein crosslinked to
RNA molecules shows lower electrophoretic mobility. Please note that: (1)
Rec-bound RNAs are resistant to even high concentrations of RNase I,
probably indicating extensive secondary RNA structures, and (2) low/no 
background of contaminating RNAs in control immunoprecipitation from 
wild-type human EC cells or Flag-eGFP control human EC cells. Western
blots with anti-GFP antibody were also performed to confirm the presence
of tagged protein in Flag-eGFP control and Flag-eGFP-Rec cells, both in
input and immunoprecipitation fractions (middle). HSP90 was used as a 
loading control (bottom). c, Computationally predicted (using mFold) 
secondary structure of LTR5HS sequence around the Rec response element 
(identified experimentally in vitro previously). Single-nucleotide resolution
Rec ultraviolet-crosslinking sites determined by iCLIP are shaded in red;
n = 2 biological replicates.
Extended Data Figure 9 | Rec target mRNA analysis (supporting Fig. 4).
a, Genome browser representations of the Rec iCLIP read (n = 2 biological 
replicates) distribution at indicated mRNA targets. b, Computationally 
predicted (using mFold) secondary structures of indicated Rec iCLIP-seq 
targets. Single-nucleotide resolution Rec ultraviolet-crosslinking sites
determined by iCLIP are shaded in red; to orient the reader, browser 
representation of the folded fragment is shown above each respective 
cartoon. 
Extended Data Figure 10 | Model of HERVK regulation and function.

Epicardial regeneration is guided by cardiac outflow
tract and Hedgehog signalling 


Jinhu Wang¹*, Jingli Cao¹*, Amy L. Dickson¹ & Kenneth D. Poss¹


In response to cardiac damage, a mesothelial tissue layer envel- 
oping the heart called the epicardium is activated to proliferate 
and accumulate at the injury site. Recent studies have implicated 
the epicardium in multiple aspects of cardiac repair: as a source of 
paracrine signals for cardiomyocyte survival or proliferation; a 
supply of perivascular cells and possibly other cell types such as 
cardiomyocytes; and as a mediator of inflammation. However,
the biology and dynamism of the adult epicardium are poorly understood.
To investigate this, we created a transgenic line to ablate the
epicardial cell population in adult zebrafish. Here we find that 
genetic depletion of the epicardium after myocardial loss inhibits 
cardiomyocyte proliferation and delays muscle regeneration. The 
epicardium vigorously regenerates after its ablation, through pro- 
liferation and migration of spared epicardial cells as a sheet to 
cover the exposed ventricular surface in a wave from the chamber 
base towards its apex. By reconstituting epicardial regeneration ex 
vivo, we show that extirpation of the bulbous arteriosus—a dis- 
tinct, smooth-muscle-rich tissue structure that distributes outflow 
from the ventricle—prevents epicardial regeneration. Conversely, 
experimental repositioning of the bulbous arteriosus by tissue 
recombination initiates epicardial regeneration and can govern 
its direction. Hedgehog (Hh) ligand is expressed in the bulbous 
arteriosus, and treatment with a Hh signalling antagonist arrests 
epicardial regeneration and blunts the epicardial response to mus- 
cle injury. Transplantation of Sonic hedgehog (Shh)-soaked beads 
at the ventricular base stimulates epicardial regeneration after 
bulbous arteriosus removal, indicating that Hh signalling can sub- 
stitute for the influence of the outflow tract. Thus, the ventricular 
epicardium has pronounced regenerative capacity, regulated by the 
neighbouring cardiac outflow tract and Hh signalling. These find- 
ings extend our understanding of tissue interactions during regen- 
eration and have implications for mobilizing epicardial cells for 
therapeutic heart repair. 

To assess the homeostatic properties of the epicardium, we used an 
inducible cell ablation system in adult zebrafish. Targeted expression of 
bacterial nitroreductase (NTR) depletes specific cell types via conversion
of a non-toxic substrate, metronidazole (Mtz), to a cytotoxin.
We used tcf21 regulatory sequences, which in zebrafish drive the most
widespread epicardial expression of known DNA elements, to create
an NTR transgenic line for lesioning this tissue without direct myocardial
damage. After treatment of adult tcf21:NTR; tcf21:nuceGFP
animals with Mtz, ~90% of enhanced green fluorescent protein
(eGFP)+ epicardial nuclei on average were ablated from the ventricular
surface in large patches (Fig. 1a, b, f).

To determine whether epicardial depletion affects the well-documented
capacity of the zebrafish heart to regenerate, we transiently
incubated tcf21:NTR zebrafish with Mtz after resection of the ventricular
apex. Mtz treatment reduced epicardial cell number in the 7
days post-amputation (dpa) injury site by ~45%, while reducing cardiomyocyte
proliferation indices by ~33% (Fig. 1c, d and Extended
Data Figs 1a, b, 3c). Myofibroblasts were represented similarly in
vehicle- and Mtz-treated clutchmates by 14 dpa (Extended Data
Fig. 1c). Injured ventricles of Mtz-treated animals displayed reduced
vascularization and muscularization by 30 dpa (Fig. 1e and Extended
Data Fig. 1d, e), associated with fibrin and collagen retention (Fig. 1e).
By 60 dpa, ventricles from Mtz-treated zebrafish consistently showed 
normal muscularization and a large complement of tcf21:nuceGFP+
cells, along with minor collagen deposits (Extended Data Fig. 1f). Thus, 
depletion of epicardial tissue inhibits cardiomyocyte proliferation and 
vascularization after resection injury, reducing the efficacy of heart 
regeneration. 

These experiments suggested a high capacity of epicardial cells to 
regenerate after major depletion. To test this directly, we examined 
otherwise uninjured hearts at different times after epicardial ablation. 
Ventricular epicardial cells typically have a low proliferation index 
(Extended Data Fig. 2a). However, within 3 days of Mtz treatment 
(3 days post-incubation (dpi)), many spared epicardial cells entered 
the cell cycle (Extended Data Fig. 2b, c). At 7 dpi, ventricles displayed 
quantifiable epicardial recovery that was more prominent at the cham- 
ber base (Fig. 1b). By 14 dpi, and as early as 7 dpi, ventricles were fully 
covered to their apices with tcf21:nuceGFP+ epicardial cells (Fig. 1b, f).
The temporal variation in recovery probably reflects variation in the 
location/pattern of epicardial cells spared by ablation among clutch- 
mates, or in chamber size (Extended Data Fig. 3a). To examine the 
origins of the regenerated epicardium, we used inducible Cre-based 
genetic fate-mapping to label tcf21-expressing cells and their progeny 
permanently before injury. Labelling and subsequent fate-mapping
experiments indicated that pre-existing epicardial cells, and not a 
tcf21-negative precursor, are a primary source for regeneration 
(Fig. 1g, h). In sum, these experiments reveal that adult epicardium 
regenerates after substantial genetic ablation, through a mechanism of 
expansion by spared epicardial cells. 

To expand our range of experimental manipulations, we refined 
protocols such that freshly dissected hearts contracted for several 
weeks ex vivo (Supplementary Video 1). When Mtz was added
transiently to the culture medium for 1 day, ventricular epicardial cells 
were potently ablated. Epicardial layers of the atrium and the bulbous 
arteriosus (alternatively referred to as the outflow tract) were less 
effectively depleted (Fig. 2a), probably owing to differential expression 
of the NTR transgene among cardiac chambers (Extended Data 
Fig. 3b). Daily imaging of these hearts confirmed observations from 
in vivo experiments, demonstrating regeneration of the epicardium 
from base to apex that was typically completed in 2 weeks (Fig. 2a). 
Hearts from animals given partial ventricular resections in vivo showed 
a similar pattern of epicardial regeneration after ex vivo ablation 
(Extended Data Fig. 4a). Cardiac muscle regeneration was ineffective 
in explanted hearts in our experiments. Increases in cell number 
occurred concomitantly with movement across the myocardial surface 
during epicardial regeneration, with spared epicardial cell patches away 
from the leading edge eventually incorporated into the sheet (Fig. 2a). 

To identify possible intrinsic differences in epicardial cells from 
different ventricular regions, we examined behaviours of basal or 
Figure 1 | Epicardial ablation and regeneration. a, Adult zebrafish heart.
OFT, outflow tract. b, tcf21:NTR; tcf21:nuceGFP adults were incubated with 
Mtz or vehicle, and hearts were collected by random sampling at 3, 7 or 14 dpi. 
White dashed lines delineate ventricle. Numbers in bottom right corners are 
proportion of total animals with indicated phenotype. All 3 dpi ventricles 
showed major ablation, averaging ~90% loss. c, Left, schematic for tests of 
epicardial ablation on muscle regeneration. Right, ventricular cardiomyocyte 
proliferation at 7 dpa. Brackets indicate injury site. Arrowheads indicate 
proliferating cardiomyocytes. DAPI, 4’,6-diamidino-2-phenylindole; WT, wild 
type. d, Quantified PCNA+ cardiomyocyte indices in injury sites in
experiments from c. ***P < 0.001, Mann-Whitney rank sum test; n = 18 (wild 
type) and 19 (tcf21:NTR) animals from two experiments. e, Section images of 


apical epicardial tissue patches transplanted to ablated ventricles. In 
these experiments, transplanted cells of either origin consistently repopulated
the ventricular surface in a base-to-apex direction after transplantation
(Extended Data Fig. 5a–d), revealing no proliferative bias in
ventricular epicardial cells that could explain the directional flow of 
regeneration. To assess potential extrinsic influences on epicardial 
regeneration, we removed the atrium or bulbous arteriosus from its 
attachment at the ventricular base before epicardial ablation. Atrial 
extirpation did not noticeably affect the regeneration of ventricular 
epicardium (Fig. 2b and Supplementary Video 2). By contrast, removal 
of outflow tract tissue blocked epicardial cell recovery, an arrest that 
persisted for at least 2 weeks (Fig. 2c, d and data not shown). To test 
whether this arrest was solely a consequence of mechanical tissue 
disruption, we ablated the epicardium after host bulbous arteriosus 
removal, before grafting a non-transgenic bulbous arteriosus to the 
ventricular base 2 days later. In most of these tissue recombination 
procedures (13 of 21), host tcf21:nuceGFP+ epicardium regenerated to
cover the ventricle (Fig. 3a). This effect was not observed when a 
portion of donor ventricular apex was inverted and transplanted to 
ventricles at 30 dpa, assessed for muscle recovery (MHC) and scar indicators
(fibrin, collagen). One of eleven tcf21:nuceGFP and 8 of 12 tcf21:NTR;
tcf21:nuceGFP ventricles showed myocardial gaps. Dashed line indicates
approximate resection plane. **P < 0.01, Fisher–Irwin exact test. f, Quantified
eGFP+ nuclei from experiments in b. ***P < 0.001, Student’s two-tailed t-test.
g, Left, CreER-based strategy for permanent labelling of tcf21+ progeny. Right,
section images of lineage-labelled eGFP+ epicardial progeny up to 14 dpi,
indicating derivation from pre-existing epicardium. Arrows indicate eGFP+
cells spared by epicardial ablation. h, Quantified eGFP+ cells from experiments
in g. ***P < 0.001, Student’s two-tailed t-test; n = 10 (vehicle, 3 dpi), 13 (Mtz, 3
dpi) and 15 (Mtz, 14 dpi). c, g, Insets, high magnifications of boxed areas. Scale
bars, 50 µm. Error bars indicate standard deviation (s.d.).
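
The legend for panel e reports myocardial gaps in 1 of 11 control ventricles and 8 of 12 ablated ventricles, with P < 0.01 by a Fisher exact test. As a rough check of that arithmetic, the sketch below computes a two-sided Fisher exact P value from those counts using only the standard library. The two-sided convention (summing all tables no more probable than the observed one) is an assumption, since the legend does not state which form was used:

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so ties are not lost to floating-point rounding.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Rows: control (tcf21:nuceGFP) and ablated (tcf21:NTR; tcf21:nuceGFP).
# Columns: ventricles with gaps, ventricles without gaps.
p = fisher_exact_two_sided(1, 10, 8, 4)
print(f"two-sided P = {p:.4f}")  # about 0.0094
```

Under this convention the counts give a P value just below 0.01, in line with the significance level marked in the legend.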


the host ventricular base (Extended Data Fig. 5e). Complementary 
grafting experiments indicated that bulbous arteriosus could contrib- 
ute epicardial cells to the ventricular surface, as a potential supplement 
to expansion of the ventricular epicardial cell pool (Extended Data 
Fig. 5f). Thus, our experiments indicate that outflow tract tissue pro- 
vides an essential interaction for regeneration from existing ventricular 
epicardial cells. 

To test whether outflow tract tissue is sufficient to stimulate epicar- 
dial regeneration, we ectopically positioned experimentally manipu- 
lated cardiac structures. Co-culture of several outflow tracts in a 
transwell assay with an epicardially ablated ventricle did not restore 
regeneration in the absence of host bulbous arteriosus (Extended Data 
Fig. 6a). Similarly, a bulbous arteriosus graft placed at the ventricular 
apex showed no evidence of directing regeneration of basally located 
host epicardial cells towards the apex (Extended Data Fig. 6b). Thus, 
we could not detect bulbous arteriosus effects requiring long-range 
diffusion through tissue or culture medium. Next, we transplanted a 
tcf21:nuceGFP+ epicardial cell patch to the apex of an ablated host
ventricle, after which we grafted a wild-type bulbous arteriosus to the 
Figure 2 | Cardiac outflow tract is required for regeneration of adjacent
ventricular epicardium. a, Top, schematic for epicardial ablation and 
regeneration in hearts cultured ex vivo. Bottom, regeneration occurs in a base- 
to-apex direction (arrows). Isolated patches (circled by blue dashed lines) do 
not participate in regeneration until contacted by the leading edge. d, days. 
b, Ventricular epicardium regenerates in the absence of the atrium (n = 19; 
behaviour seen in all samples). Arrows indicate direction of regeneration. 

c, d, Ventricular epicardium fails to regenerate in the absence of the bulbous 
arteriosus. Ventricular epicardium showed defective regeneration in these 
experiments with (c) (n = 6; all samples) or without atrium (d) (n = 14; all 
samples), even when, in rare cases, many basal epicardial cells were spared 
(d). Red dashed lines delineate epicardium or epicardial leading edge. White 
dashed lines delineate ventricle. OFT, outflow tract. Scale bars, 50 µm.


apex (Fig. 3b). Remarkably, the apical bulbous arteriosus was capable 
of stimulating apex-to-base regeneration from the nearby epicardial 
patch in a high proportion (21 of 32) of experiments, effectively revers- 
ing the stereotypical direction of recovery (Fig. 3c, d). Together, these 
experiments indicate that the cardiac outflow tract is necessary and 
sufficient for epicardial regeneration, and that this neighbouring tissue 
provides a short-range influence(s) that directs regeneration from base 
to apex. 

To pursue the molecular nature of interactions between outflow 
tract and ventricular epicardium, we applied a small panel of signalling 
pathway effectors to epicardially ablated hearts cultured ex vivo. 
Among several compounds (Extended Data Fig. 7), the Smoothened 
(Smo) antagonist cyclopamine (CyA) blocked regeneration; nonethe- 
less, regeneration initiated normally after drug washout (Fig. 4a). CyA 
treatment reduced spontaneous epicardial cell 5-ethynyl-2'-deoxyur- 
idine (EdU) incorporation occurring in the first 2 days of explant 
culture, suggesting that intact Hh signalling promotes epicardial pro- 
liferation (Extended Data Fig. 8a, b). CyA also disrupted in vivo epi- 
cardial regeneration, not only in adults but in larvae, an additional 
developmental setting in which we identified base-to-apex regeneration
(Fig. 4b, c and Extended Data Fig. 9a–c). CyA treatment from 2 to
4 days post-fertilization also reduced the initial epicardial occupancy
of the larval ventricle (Extended Data Fig. 9d–f), indicating that epicardial
regeneration recapitulates at least one pathway influential in
morphogenesis. Finally, we observed inhibitory effects of CyA on the 
epicardial proliferative response to muscle resection in vivo, and in 
coverage of these injuries ex vivo (Extended Data Fig. 4b and Extended 
Data Fig. 8c, d). 
Figure 3 | Outflow tract tissue is sufficient to initiate and redirect epicardial
regeneration. a, Top, a non-transgenic donor outflow tract (OFT) was 
transplanted to a transgenic ventricular base after epicardial ablation ex 
vivo. Bottom, base-to-apex epicardial regeneration (arrows) was observed 
from host tissue in 13 of 21 ventricles. dpt, days post-transplantation. 

b, Experimental design in c, d. c, tcf21:nuceGFP+ epicardial cells
transplanted to an epicardially ablated ventricular apex were static or
regenerated (arrows) towards the apex (n = 12, all samples), but not
towards the base. d, tcf21:nuceGFP+ epicardial cells were transplanted to
the apex of an epicardially ablated ventricle, followed by apical grafting of a
donor bulbous arteriosus. tcf21:nuceGFP+ cells regenerated in a reversed
apex-to-base direction (arrows) in 9 of 14 ventricles. Twelve of eighteen 
host ventricles with the host bulbous arteriosus removed before donor 
bulbous arteriosus grafting also showed apex-to-base regeneration. 

a, c, d, Red dashed lines delineate epicardial sheet edge; white dashed lines 
delineate ventricle and host bulbous arteriosus. a, d, Yellow dashed lines 
indicate donor bulbous arteriosus. Scale bars, 50 µm.


Smo is an effector for several Hh family ligands, which have potent 
short-range effects in multiple contexts of embryonic development.
Quantitative polymerase chain reaction (qPCR) revealed shha, ihhb and 
dhh ligand transcripts in adult atrium, ventricle and bulbous arteriosus, 
where in situ hybridization detected shha and dhh transcript signals in 
smooth muscle tissue (Extended Data Fig. 10a–d). Epicardial ablation
injury boosted bulbous arteriosus and ventricular shha levels, as well as 
levels of ptch1 and gli2a in purified epicardial cells (Fig. 4d, e). 
Moreover, a shha:eGFP reporter strain visualized shha regulatory- 
sequence-driven fluorescence in smooth muscle and epicardial tissues 
of the bulbous arteriosus (Extended Data Fig. 10e). No additional in situ 
hybridization or shha:eGFP fluorescence patterns were detectable after 
epicardial ablation; however, apical resection injury induced fluor- 
escence in ventricular epicardial tissue by 2 dpa (Extended Data 
Fig. 10d, f). To test whether local Hh ligand delivery is sufficient to 
substitute for the bulbous arteriosus, we removed atrium and bulbous 
arteriosus from cardiac explants, ablated epicardial cells, and applied 
Bottom, CyA arrested regeneration while present
(n = 7; all samples), and regeneration initiated 
(arrows) after CyA removal (6 of 7 ventricles). 

b, tcf21:NTR; tcf21:nuceGFP adults were incubated 
with Mtz and randomly separated into two groups 
for treatment with CyA or vehicle. c, Quantified 
tcf21:nuceGFP+ epicardial cells from experiments
in b. *P < 0.05, Student’s two-tailed t-test; n = 13 
(vehicle) and 12 (CyA). d, Quantitative RT-PCR 
detecting shha and ihhb expression in outflow tract 
(OFT) after epicardial ablation (3 dpi), from 3 
separate experiments on pooled tissues using 90 
total zebrafish. e, RT-qPCR of fluorescence- 
activated cell sorting (FACS)-purified ventricular 
epicardial cells showed increased ptch1 and gli2a 
expression after epicardial ablation (7 dpi), from 2 
separate experiments using 209 zebrafish. f, Top, 
beads soaked with Shh protein to the exposed ventricular base. Shh-soaked
beads stimulated epicardial regeneration (one-half or greater
coverage) in 9 of 32 ventricles, whereas this level of recovery never 
occurred after transplantation of BSA-soaked beads (0 of 27 ventricles; 
Fig. 4f). We speculate that these effects of Hh on the epicardial sheet 
might involve cytoplasmic extensions or a factor transport system.
Together, our findings support a model in which Hh ligand from the 
outflow tract, and possibly additional tissues, guides the base-to-apex 
regeneration of ventricular epicardium. 

In conclusion, we have identified a requirement for the mesothelial 
covering of the zebrafish heart for proficient muscle regeneration. 
Moreover, we show that the ventricular epicardium itself has high 
endogenous renewal capacity, vigorously regenerating as a sheet from 
the base of the chamber to its apex after genetic depletion. Our results 
point to the outflow tract as an unexpected signalling centre and source 
of Hh, and possibly additional influences, that can promote epicardial 
regeneration. It is likely that tissue regeneration is similarly regulated 
in trans in other contexts; for example, to maintain the mesothelium 
that lines abdominal organs. As a mediator of epicardial regeneration, 
Hh signalling can be integrated into new strategies to modulate repair 
of the damaged heart. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS

Zebrafish maintenance and procedures. Adult zebrafish of the Ekkwill and 
Ekkwill/AB strains were maintained as described and resection injuries were
performed as described. Animals of both sexes between 4 and 12 months of age were
used. Transgenic lines used in this study were Tg(tcf21:mCherry-NTR)
(tcf21:NTR, described later), Tg(tcf21:nuceGFP) (ref. 24), Tg(tcf21:CreER)
(ref. 2), Tg(gata5:loxp-mCherry-loxp-nuceGFP) (ref. 25), Tg(fli1a:eGFP) (ref.
26) and Tg(shha:eGFP) (ref. 27). All transgenic strains were analysed as
hemizygotes. For epicardial ablation experiments in adults, animals were bathed for 24 h
in 10 mM Mtz (Sigma) as described and returned to water. If ablation was
performed after ventricular resection, we used a protocol of daily changes of 1 
mM Mtz solution for 3 days that had similar ablation effects (Extended Data 
Fig. 3c). This period corresponds with the early epicardial proliferative response 
to resection injury (data not shown), and was intended to extend the ablation
window and improve animal survival. For larval epicardial ablation, 6 hours post- 
fertilization (hpf) embryos were bathed for 48 h in 10 mM Mtz before washout. For 
ex vivo epicardial ablation, dissected hearts were bathed for 24 h in 1 mM Mtz 
before washout. Cyclopamine (CyA; Selleckchem) was dissolved in ethanol to a 
final concentration of 20 mM. CyA was used at 10 µM for in vivo treatment of adult
animals and 5 µM for ex vivo culture and embryo treatments. For EdU incorporation
experiments that followed epicardial ablation, animals were injected with 10
mM EdU 4 h before collection. Experiments with uninjured animals used three 
daily 10 mM EdU injections. For lineage tracing, strains carrying tcf21:NTR; 
tcf21:CreER and gata5:loxp-mCherry-loxp-nuceGFP transgenes were placed in a 
small beaker of aquarium water containing 5 µM tamoxifen. Fish were maintained
for 24 h, rinsed with fresh aquarium water, and returned to a recirculating aquatic 
system for 24 h, before repeating this incubation twice. After 3 days of rinsing, Mtz 
was added for an additional 24 h. As is common when using Cre-based tools, we 
could not genetically label all epicardial cells in these experiments or rule out minor 
contributions by tcf21-negative cells. Animal procedures were performed in 
accordance with Duke University guidelines. 
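
As a worked illustration of the cyclopamine concentrations given above (20 mM stock in ethanol, 10 µM in vivo, 5 µM ex vivo), the volume of stock to add follows from C1·V1 = C2·V2. This is a sketch only: the bath and culture volumes are hypothetical, since the text does not state them, and the function name is illustrative.

```python
def stock_volume_ul(stock_mM, final_uM, final_volume_ml):
    """Volume of stock (in µl) needed to reach final_uM, from C1*V1 = C2*V2."""
    stock_uM = stock_mM * 1000.0            # mM -> µM
    final_volume_ul = final_volume_ml * 1000.0  # ml -> µl
    return final_uM * final_volume_ul / stock_uM


# 20 mM stock and 10 µM / 5 µM working concentrations are from the text;
# the 100 ml bath and 2 ml culture volumes are assumptions for illustration.
in_vivo_ul = stock_volume_ul(stock_mM=20, final_uM=10, final_volume_ml=100)
ex_vivo_ul = stock_volume_ul(stock_mM=20, final_uM=5, final_volume_ml=2)
print(in_vivo_ul, ex_vivo_ul)
```

Under these assumed volumes the dilutions are 1:2,000 and 1:4,000, so the ethanol carried over from the stock stays at or below 0.05% by volume.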

Construction of tcf21:NTR zebrafish. The translational start codon of tcf21 in 
the bacterial artificial chromosome (BAC) clone DKEYP-79F12 was replaced with 
the mCherry-NTR cassette by Red/ET recombineering technology (Gene 
Bridges). The 5′ and 3′ homologous arms for recombination were a 50-base pair
(bp) fragment upstream and downstream of the start codon, and were included in 
PCR primers to flank the mCherry-NTR cassette. To avoid aberrant recombination 
between the mCherry-NTR cassette and the endogenous loxP site in the BAC 
vector, we replaced the vector-derived loxP site with an I-SceI site using the same
technology. The final BAC was purified with a Nucleobond BAC 100 kit (Clontech)
and co-injected with I-SceI into one-cell-stage zebrafish embryos.

Ex vivo cardiac explants. Adult hearts were rinsed several times in PBS after 
collection and cultured in dishes with DMEM medium (Life Technologies) plus 
2 mM L-glutamine (Life Technologies), 10% fetal bovine serum (HyClone,
Thermo), 1% MEM non-essential amino acids (Life Technologies), 100 U ml−1
penicillin, 100 µg ml−1 streptomycin (Life Technologies) and 50 µM
2-mercaptoethanol (Life Technologies), while rotating at 150 r.p.m. Primocin (InvivoGen)
was added to prevent microbial contamination during the first 3 days of primary
culture. For transplantation experiments, outflow tract or ventricular tissues were
grafted by mounting in 1% low-melting-point agarose with ablated hearts in 
culture dishes, and covering with medium. After 2 days of culturing, attached 
tissues were released from the agarose; if transplanting an epicardial patch, the 
ventricular donor tissue was removed carefully using forceps. Fluorescent
transgenes in these cardiac explants were monitored using a Leica M205FA
stereofluorescence microscope.

Recombinant mouse Shh (C25II), N terminus protein (R&D Systems) was
reconstituted at 100 µg ml−1 in PBS containing 0.1% BSA. Affi-Gel Blue beads
(Bio-Rad) were prepared by thoroughly washing the beads in PBS, then incubating 
them in the Shh solution for 2 h at room temperature. A solution with the same 
concentration of BSA protein was used as the control. The beads were then applied 
to the base of ventricular explants that were settled in low-melting-point agarose in 
serum-free DMEM supplemented as described earlier. After 24 h, the ventricles 
were released from the agarose with the attached beads and cultured in supple- 
mented, serum-free DMEM. Under these ex vivo culture conditions we observed 
no increase in cardiomyocyte proliferation upon resection injury. 

Preparation of outflow tract and ventricular epicardial cells and RT-qPCR. 
Adult hearts were dissected from tcf21:NTR or tcf21:NTR; tcf21:nuceGFP animals 
3 and 7 days post-treatment with vehicle or Mtz. Outflow tracts were frozen in 
liquid nitrogen. Ventricular nuceGFP+ epicardial cells were isolated as described
previously with modifications. Briefly, ventricles were collected on ice and
washed several times to remove blood cells. Ventricles were digested in an 
Eppendorf tube with 0.5 ml HBSS plus 0.13 U ml−1 Liberase DH (Roche) and
1% sheep serum at 37 °C, while stirring gently with a Spinbar magnetic stirring bar 
(Bel-Art Products). Supernatants were collected every 5 min and neutralized with 
sheep serum. Dissociated cells were spun down and sorted using a BD 
FACSVantage SE sorter for eGFP-positive cells. Total RNA was extracted using 
an RNAqueous-Micro isolation kit (Ambion) according to the manufacturer’s 
instructions. Reverse transcription was carried out using a SuperScript III First- 
Strand synthesis kit (Life Technologies). qPCR was carried out in triplicate using a 
Roche LightCycler 480 II system with the LightCycler 480 Probes Master Mix and 
the Universal Probe Library (UPL) (Roche). rpl13a served as the control. Primers
and the UPL numbers used in this study were:
shha-f, AAGCCCACATTCATTGCTCT, and shha-r, CCTTCTGTCCTCCGTCCTG, UPL #54;
shhb-f, GCAAGTATGGGATGCTATCCAG, and shhb-r, TCCTGATTTAGCAGCCACTG, UPL #16;
ihha-f, T@GGTCTACTATGAGTCCAAAGC, and ihha-r, GGTTTTAGCAGCCACAGAGTG, UPL #86;
ihhb-f, TCTTGTTATGCTGCGGTGAA, and ihhb-r, AAGCGTAGAGGTGCAAAAGC, UPL #140;
dhh-f, CGTGCACTGCTCTGTCAAA, and dhh-r, AAACATGACATGGGCTTTTGT, UPL #156;
ptch1-f, TGGCTTAAGGGCAGCTAATC, and ptch1-r, GCCGTGTGTACTTGAGTTCCT, UPL #87;
ptch2-f, CCATGACATCAACTGGAATGA, and ptch2-r, GAATGCTCCCATGAACAACC, UPL #6;
gli1-f, CTGCAGCAAAGAGTTCGACA, and gli1-r, CTCCGTGGATGTGCTCATTA, UPL #68;
gli2a-f, CCTCACCCACCACAGCAT, and gli2a-r, CGATCGGGATTGGTGTGT, UPL #39;
gli2b-f, CCTGCCAGAATACTTCACATCA, and gli2b-r, CTCAACCTGGGCGTCATAC, UPL #48;
rpl13a-f, GCGGACCGATTCAATAAGG, and rpl13a-r, GAAAGACGACCGAGGAGATG, UPL #147.
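
The text states that qPCR was run in triplicate with rpl13a as the control, but not how relative expression was calculated. One common approach is the 2^(−ΔΔCt) method, shown here purely as an assumed sketch: the Ct values are hypothetical, the function name is illustrative, and the formula presumes roughly 100% amplification efficiency for both target and control.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method (assumed, not from the source)."""
    d_ct_treated = ct_target_treated - ct_ref_treated   # normalize to reference gene
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control                 # treated relative to control
    return 2.0 ** (-dd_ct)


# Hypothetical mean Ct values (e.g. target = ptch1, reference = rpl13a).
example = fold_change(ct_target_treated=24.0, ct_ref_treated=18.0,
                      ct_target_control=26.0, ct_ref_control=18.0)
print(example)
```

With these made-up inputs the target crosses threshold two cycles earlier after treatment, which corresponds to a fourfold increase.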

Histology. Analyses of cardiomyocyte proliferation were performed as previously 
described by counting Mef2+ and PCNA+ nuclei in wound sites. To quantify
vascular endothelial cells in the wound site by 30 dpa with fli1a:eGFP or tcf21:NTR;
fli1a:eGFP animals, three medial, longitudinal sections were selected from each
heart. Images of single optical slices of green fluorescence in the wound site were
acquired using a ×20 objective (1,024 × 1,024 pixels). eGFP+ areas were
quantified in pixels by ImageJ software, and the ratio of eGFP+ area to the length of
the outlined apical wound was calculated for each heart. To quantify tcf21+
epicardial cells in the wound site at 7 and 30 dpa with tcf21:nuceGFP or tcf21:NTR;
tcf21:nuceGFP animals, three medial, longitudinal sections were selected from
each heart. eGFP+ cells were counted in the wound area, and the ratio of
eGFP+ cells to the length of the outlined apical wound was calculated for
each heart. Acid Fuchsin-Orange G and immunostaining were performed as
described. Primary antibodies used in this study were anti-myosin heavy chain
(MHC; F59, mouse; Developmental Studies Hybridoma Bank), anti-GFP (rabbit; 
Invitrogen), anti-Mef2 (rabbit; Santa Cruz Biotechnology), anti-MLCK (mouse, 
K36; Sigma) and anti-PCNA (mouse; Sigma). Secondary antibodies (Invitrogen) 
used in this study were: Alexa Fluor 488 goat anti-rabbit; Alexa Fluor 594 goat anti- 
rabbit, goat anti-rat and goat anti-mouse; and Alexa 633 goat anti-mouse. EdU was 
detected through a click reaction as described previously with fluorescent azide
(Alexa Fluor 594 or 647; Invitrogen). Whole-mounted and sectioned ventricular
tissues were imaged using a Zeiss 700 confocal microscope.
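
The per-heart wound measurements described above (eGFP+ pixel area or cell count relative to the length of the outlined wound, from three sections per heart) can be sketched as follows. The text does not say how the three sections were combined, so averaging the per-section ratios is an assumption; the numbers and function name are hypothetical.

```python
from statistics import mean


def signal_per_wound_length(sections):
    """Mean ratio of eGFP+ signal to wound length across the sections of one heart.

    sections: list of (egfp_signal, wound_length) tuples, one per section, where
    egfp_signal is a pixel area or a cell count and wound_length is in pixels.
    """
    ratios = []
    for signal, length in sections:
        if length <= 0:
            raise ValueError("wound length must be positive")
        ratios.append(signal / length)
    return mean(ratios)


# Hypothetical measurements for three medial, longitudinal sections of one heart.
heart = [(1200, 300), (900, 300), (1500, 300)]
value = signal_per_wound_length(heart)
print(value)
```

Normalizing to wound length rather than reporting raw counts makes hearts with differently sized resections comparable.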

Data collection and statistics. Clutchmates (or hearts collected from clutch- 
mates) were randomized into different treatment groups for each experiment. 
No animal or sample was excluded from the analysis unless the animal died during 
the procedure. All experiments were performed with at least two biological repli- 
cates, using appropriate numbers of samples for each replicate. Sample sizes were 
chosen on the basis of previous publications and experiment types, and are indi- 
cated in each figure legend. For expression patterns, at least six fish were examined. 
For assessment of epicardial ablation and consequences on muscle regeneration, at 
least nine fish were examined. At least 12 hearts of each group were pooled for 
RNA purification and subsequent RT-qPCR. For ex vivo epicardial ablation 
experiments, at least six hearts were used for each treatment. An exception was 
the small compound screen, where at least four hearts were used for each drug. 
Quantification of cell proliferation and calculation of statistical outcomes were 
assessed by a person blinded to the treatments. All statistical values are displayed as 
mean ± standard deviation. Sample sizes, statistical tests, and P values are
indicated in the figures or the legends. Student’s t-tests (two-tailed) were applied when
normality and equal variance tests were passed. The Mann-Whitney rank sum test
was used when these failed. Fisher–Irwin exact tests or chi-squared tests were used
where appropriate. 
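
For reference, the mean ± standard deviation summary and the equal-variance Student's t statistic named above can be written out with the Python standard library alone. This is a minimal sketch with hypothetical data; it returns only the statistic and degrees of freedom, because converting them to a two-tailed P value needs the t distribution, which would come from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev, variance


def summarize(values):
    """Return (mean, sample standard deviation), as reported in the figures."""
    return mean(values), stdev(values)


def student_t(a, b):
    """Student's t statistic with pooled variance, and its degrees of freedom."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical cell counts for two small groups.
vehicle = [1, 2, 3]
treated = [2, 3, 4]
t_stat, dof = student_t(vehicle, treated)
print(summarize(vehicle), t_stat, dof)
```

The pooled-variance form is only appropriate when the equal-variance check mentioned in the text passes; otherwise the rank sum test applies.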
Extended Data Figure 1 | Epicardial ablation and responses to resection
injury. a, tcf21:nuceGFP or tcf21:NTR; tcf21:nuceGFP clutchmates underwent 
resection injury and were treated for 3 days with 1 mM Mtz, before collection of 
ventricles at 7 (n = 8 animals per group) and 30 dpa (n = 11). tcf21:NTR;
tcf21:nuceGFP wounds had fewer epicardial cells at 7 dpa and comparable 
occupancy by 30 dpa compared with tcf21:nuceGFP wounds. White dashed 
lines indicate wound edge. b, Quantification of eGFP+ epicardial cells at 7 and
30 dpa with respect to wound edge lengths from a. NS, not significant;
Mann-Whitney rank sum test. c, tcf21:NTR clutchmates underwent resection injury
and were treated for 3 days with 1 mM Mtz or vehicle with random separation
into two groups for treatment. MLCK+ cells, indicating myofibroblasts, had
comparable presence in both groups at 14 dpa (n = 7 for each group). 
d, tcf21:NTR; fli1a:eGFP clutchmates underwent resection injury and were
treated for 3 days with 1 mM Mtz or vehicle with random separation into two
groups for treatment. Epicardial ablation led to lower vascular density in 30 dpa
wounds compared with controls. Red dashed lines indicate wound edge.
e, Quantification of eGFP+ vascular endothelial pixel area in wounds of
tcf21:NTR; fli1a:eGFP zebrafish treated with Mtz (n = 12) or vehicle (n = 6), or
of fli1a:eGFP zebrafish treated with Mtz (n = 6), with respect to the wound edge
lengths. Student’s two-tailed t-test. f, By 60 dpa, 55 days after epicardial ablation
protocols, muscularization (top) and wound collagen deposition (bottom) were
grossly normal (n = 23). Brackets indicate area of regeneration. a, c, d, Yellow
dashed lines indicate the approximate amputation plane. Scale bars, 50 µm.
Error bars indicate s.d.
Extended Data Figure 2 | Epicardial cell proliferation without injury and
after epicardial ablation. a, Limited epicardial cell proliferation on the
ventricular surface. tcf21:nuceGFP fish were injected with 10 mM EdU once
daily for 3 days and collected 1 day after the last injection. 10⁵ ventricular
nuceGFP+ cells were assessed for EdU reactivity in 15 animals, from which 608
cells were positive (a 0.6% rate for 4 days EdU labelling). Whole-mount image is
shown, and arrows in enlarged boxed area indicate eGFP+ EdU+ nuclei.
b, tcf21:nuceGFP or tcf21:NTR; tcf21:nuceGFP fish were injected with 10 mM
EdU at 3 days post-Mtz treatment, and hearts were collected 4 h later. Boxed
areas in images of whole-mounted hearts show magnified views. a, b, Yellow
arrows indicate representative eGFP+ (green) EdU+ (red) nuclei. c, fli1a:eGFP
or tcf21:NTR; fli1a:eGFP fish were injected with 10 mM EdU at 3 days post-Mtz
treatment, and hearts were collected 4 h later. Red arrows indicate
representative eGFP+ (green) EdU+ (magenta) endocardial cell nuclei; yellow
arrowheads, representative eGFP+ (green) EdU+ (magenta) vascular
endothelial cell nuclei; white arrowheads, EdU+ (magenta) nuclei within the
ventricular lumen, ostensibly erythrocyte nuclei. Scale bars, 50 µm.
Extended Data Figure 3 | Mosaic NTR expression and patterns of spared
epicardial cells after ablation. a, Whole-mounted examples of varied location/ 
pattern of spared epicardial cells in ventricles from tcf21:NTR; tcf21:nuceGFP 
adult clutchmates 3 days after incubation with 10 mM Mtz. White dashed lines 
delineate ventricle. b, Differential expression of the NTR transgene among 
cardiac chambers. In adult tcf21:NTR; tcf21:nuceGFP hearts, eGFP expression is 
comparable in epicardial tissue covering the atrium, outflow tract (OFT) and
ventricle. By contrast, NTR (red, indicated by mCherry) expression is patchy
and/or weak in the atrium and outflow tract compared with ventricular 
expression. c, Section images of ventricles from tcf21:nuceGFP (left) or 
tcf21:NTR; tcf21:nuceGFP zebrafish (right) treated with 1 mM Mtz (right) for 3 
days, and collected 2 days later. Ventricular epicardium was ablated effectively 
in these experiments. Scale bars, 50 µm.
Extended Data Figure 4 | Epicardial regeneration after ventricular
resection. a, Hearts were removed from tcf21:NTR; tcf21:nuceGFP fish 
immediately after ventricular resection injuries, followed by 24 h of Mtz and a
24 h washout ex vivo. A base-to-apex pattern of epicardial regeneration was 
observed, in this example covering the apical wound by 11 dpa (n = 18; 
behaviour seen in all samples). Epicardial coverage of resection injuries in these 
ablation experiments is delayed compared to ventricles recovering with an
intact epicardium (b, top). Yellow boxed area, magnified view of the apical
wound. Red dashed lines delineate ventricle. b, Hearts were removed from 
tcf21:nuceGFP clutchmates immediately after apical resection injury and 
cultured ex vivo, before random separation into two treatment groups. 
Epicardial cells covered the wound area by 3 dpa (n = 11; behaviour seen in all 
samples), unless treated with CyA (n = 26; failed coverage in 20 of 26 
ventricles). a, b, White dashed lines indicate apical wounds. Scale bars, 50 µm.
Extended Data Figure 5 | Ex vivo grafts and epicardial regeneration.

a, Schematic of the experimental design. b, c, Epicardial cells transplanted at the 
base of an epicardially ablated host regenerated towards the apex regardless of 
basal (b) (n = 25, behaviour seen in all samples) or apical (c) (n = 27, all 
samples) origin. d, Top, epicardial cells from the base of a transgenic donor 
ventricle were transplanted to the chamber midpoint of an epicardially ablated 
host ventricle and observed for regeneration. Bottom, transplanted cells 
eventually migrated towards the apex, not the base (n = 13; all samples). e, Top, 
after epicardial ablation, the host bulbous arteriosus was replaced with a
non-transgenic donor ventricular apex and observed for regeneration. Bottom,
ventricular epicardium showed little or no regeneration in these experiments (n 
= 7; behaviour seen in all samples). f, Left, after ex vivo epicardial ablation in a 
host tcf21:NTR ventricle, the host bulbous arteriosus was replaced with a donor 
tcf21:nuceGFP bulbous arteriosus. Right, the host ventricular surface contained 
different amounts of eGFP+ nuclei in these ventricles (n = 3; behaviour seen in
all samples). b-f, Red dashed lines indicate epicardium or epicardial leading 
edge; white dashed lines delineate ventricle. e, f, Yellow dashed lines indicate 
donor apex (e) or bulbous arteriosus (f). Scale bars, 50 µm.
Extended Data Figure 6 | Context-specific effects of outflow tract on
epicardial regeneration. a, Top, after ex vivo epicardial ablation and bulbous 
arteriosus removal, ventricles were co-cultured with ten outflow tracts in a 
transwell assay and observed for regeneration. Bottom, no evidence for 
epicardial regeneration was observed in these experiments (n = 9; behaviour 
seen in all samples). b, Left, after ex vivo epicardial ablation and bulbous 
arteriosus removal, a non-transgenic bulbous arteriosus (labelled as donor
OFT) was transplanted to the apex and observed for regeneration. Right, no
evidence for regeneration of eGFP+ epicardium from apex to base was observed
in these experiments (n = 10; behaviour seen in all samples). Red dashed lines
delineate epicardium; white dashed lines delineate ventricle; yellow dashed
lines delineate donor outflow tract. Scale bars, 50 µm.
Extended Data Figure 7 | Small-scale screen for compounds that inhibit
epicardial regeneration. Ex vivo ablation and regeneration of tcf21:NTR;
tcf21:nuceGFP ventricles over 7 days. Mtz was added for 24 h to freshly isolated
hearts, washed out, and compounds were added after 2 days (day 0). Hearts
were treated with vehicle (n > 10), 10 µM DEAB (n = 5; Sigma-Aldrich), 100
nM LDN193189 (n = 4; Cayman Chemical), 10 µM SU5402 (n = 5; Santa Cruz
Biotechnology), 1 µM cyclosporin A (n = 4; Sigma-Aldrich), or 0.1 µg ml−1
FK506 (n = 5; Sigma-Aldrich), in each case showing base-to-apex recovery
(behaviour seen in all samples). The dissected hearts were randomly separated
into groups for drug treatment. Red dashed lines indicate epicardial leading
edge; white dashed lines delineate ventricle. Scale bars, 50 µm.
Extended Data Figure 8 | Epicardial proliferation is regulated by Hh
signalling. a, Freshly dissected tcf21:nuceGFP hearts were randomly separated
into two groups and cultured for 47 h with vehicle (n = 11) or 5 µM CyA (n =
8). Then, 25 µM EdU was added to the medium for 1 h before collection at 48 h.
CyA treatment decreases epicardial cell proliferation ex vivo. Arrows indicate
representative eGFP+ (green) EdU+ (red) nuclei. b, Quantification of
eGFP+ EdU+ nuclei per mm² on the ventricular surface, from hearts in a.
**P < 0.01, Student’s two-tailed t-test. c, tcf21:nuceGFP adult fish were
subjected to partial ventricular resection surgery, and randomly separated into
two groups for treatment with vehicle (n = 8) or 10 µM CyA (n = 10) from 2 to
3 dpa. Then, 10 mM EdU was injected intraperitoneally 1 h before collection.
CyA treatment decreases epicardial cell proliferation in vivo. Arrowheads
indicate representative eGFP+ (green) EdU+ (red) nuclei. d, Quantification
of eGFP+ EdU+ nuclei per mm² on the ventricular surface, from hearts in c.
***P < 0.001; Mann-Whitney rank sum test. c, Yellow dashed lines indicate
resection plane; white dashed lines delineate ventricle. a, c, Boxed areas,
magnified views. Scale bars, 50 µm. Error bars indicate s.d.
Extended Data Figure 9 | Larval epicardial development and regeneration.
a, tcf21:nuceGFP or tcf21:NTR; tcf21:nuceGFP larval clutchmates were treated 
with 10 mM Mtz from 6 hpf to 54 hpf, and then imaged at different times from 3
to 5 dpf. tcf21:nuceGFP larvae show normal ventricular epicardial coverage at 3 
dpf, while tcf21:NTR; tcf21:nuceGFP coverage is sparse. tcf21:NTR; 
tcf21:nuceGFP larvae with confirmed full ablation were imaged from 3 to 5 dpf, 
covering first the ventricular base and then the apex. Three different extents of 
regeneration at 5 dpf are shown: class I, greater than two-thirds coverage; class 
II, one-third to two-thirds coverage; and class III, some cells but less than one- 
third coverage. b, A subset of tcf21:NTR; tcf21:nuceGFP larvae with confirmed 
full ablations were randomly separated and treated with vehicle or CyA, which 
limited regeneration in most cases (class IV, no ventricular epicardial cells). 
c, Quantification of extents of regeneration from experiments in a and b. ***P
< 0.001, chi-squared test; n = 54 embryos for vehicle, 51 for CyA. d, Epicardial 
morphogenesis visualized in tcf21:nuceGFP larvae. No epicardial cells are 
evident at or before 2 dpf. By 3 dpf, ventricles contained 17.6 6 epicardial cells 
on average (n = 23), whereas 4 dpf larvae contained 45.2 ± 5.8 cells (n = 21).
e, tcf21:nuceGFP larval clutchmates were randomly separated into two groups 
for treatment with vehicle or 5 µM CyA from 2 to 4 dpf. f, Quantification of
ventricular eGFP+ epicardial cells from groups in e. ***P < 0.001, Student’s
two-tailed t-test; n = 21 for each group. a, b, d, e, White dashed lines delineate
ventricle. d, e, Boxed areas, magnified views. Scale bars, 50 µm. Error bars
indicate s.d. 
Extended Data Figure 10 | Hh ligand expression. a-c, Quantitative RT-PCR
revealing shha, ihhb and dhh expression in atrium (a) or ventricle (b) in 
uninjured hearts and 3 days post-ablation, or in separated ventricular basal (the 
basal third of the chamber) and apical (the apical third) tissue after ablation 
(c). Three separate quantitative RT-PCR experiments on pooled tissues were 
performed, using a total of 90 zebrafish for experiments shown in a and b, and 
another 90 fish for c. shhb and ihha were not detected in these tissues. d, In situ 
hybridization (ISH) for shha or dhh in wild-type (WT) or tcf21:NTR clutchmate 
hearts at 3 days after Mtz treatment, indicating expression in outflow tract but 
not ventricle or atrium. Outflow tract of uninjured and epicardially ablated 
hearts showed comparable shha and dhh signals by ISH, a qualitative/ 
semiquantitative assay. e, Section of adult shha:eGFP heart, indicating
fluorescence in outflow tract tissues. Smooth muscle cells (MLCK, red) and 
epicardial cells (outer layer) in outflow tract showed clear eGFP signals, while 
there is no obvious eGFP fluorescence in ventricle and atrium. Valve 
mesenchyme also displays eGFP fluorescence. Arrowheads indicate eGFP 
signals in smooth muscle cells and epicardium. f, Ventricular resection induces 
shha:eGFP fluorescence in the basal ventricular epicardium at 2 dpa. Arrows 
indicate ventricular epicardial fluorescence. d, e, White dashed lines indicate
outflow tract (d) or atrioventricular junction (e). d-f, Boxed areas, magnified
views. Scale bars, 50 µm. Error bars indicate s.d.
Spastin and ESCRT-III coordinate mitotic spindle
disassembly and nuclear envelope sealing 
At the onset of metazoan cell division the nuclear envelope breaks
down to enable capture of chromosomes by the microtubule-containing
spindle apparatus. During anaphase, when chromosomes
have separated, the nuclear envelope is reassembled around the
forming daughter nuclei. How the nuclear envelope is sealed,
and how this is coordinated with spindle disassembly, is largely 
unknown. Here we show that endosomal sorting complex required 
for transport (ESCRT)-III, previously found to promote membrane
constriction and sealing during receptor sorting, virus budding,
cytokinesis and plasma membrane repair, is transiently
recruited to the reassembling nuclear envelope during late ana- 
phase. ESCRT-III and its regulatory AAA (ATPase associated with 
diverse cellular activities) ATPase VPS4 are specifically recruited 
by the ESCRT-III-like protein CHMP7 to sites where the reforming 
nuclear envelope engulfs spindle microtubules. Subsequent asso- 
ciation of another ESCRT-III-like protein, IST1, directly recruits 
the AAA ATPase spastin to sever microtubules. Disrupting spastin 
function impairs spindle disassembly and results in extended local- 
ization of ESCRT-III at the nuclear envelope. Interference with 
ESCRT-III functions in anaphase is accompanied by delayed 
microtubule disassembly, compromised nuclear integrity and the 
appearance of DNA damage foci in subsequent interphase. We 
propose that ESCRT-III, VPS4 and spastin cooperate to coordinate 
nuclear envelope sealing and spindle disassembly at nuclear envel- 
ope-microtubule intersection sites during mitotic exit to ensure 
nuclear integrity and genome safeguarding, with a striking mechanistic
parallel to cytokinetic abscission.

Previous studies have shown that ESCRT-III is recruited to the 
membrane bridge connecting daughter cells during cytokinesis.
Interestingly, cell lines stained with an antibody against the ESCRT- 
III subunit chromatin modifying protein (CHMP)4B showed a specific 
staining around chromatin discs during late anaphase (Fig. 1a and
Extended Data Fig. 1a, c). Similar anaphase localization was observed 
in HeLa cells stably expressing haemagglutinin (HA)-tagged CHMP4B 
(Extended Data Fig. 1d). This recruitment appeared transient as it was 
not observed during early anaphase or telophase (Fig. 1a). Live-cell 
microscopy of HeLa cells stably expressing CHMP4B fused to 
enhanced green fluorescent protein (CHMP4B-eGFP) revealed
CHMP4B-eGFP localization around chromatin discs between 6 and 
12 min after anaphase onset (Fig. 1b and Supplementary Video 1). 

As the main constituent, CHMP4B assembles with other ESCRT-III 
subunits into membrane-deforming helical filaments, and we found
that ESCRT-III subunits CHMP2A, CHMP3 and CHMP4A were also 
recruited around chromatin discs, as were the ESCRT-III associated 
proteins CHMP1A, CHMP1B and IST1 (Extended Data Fig. 1e). Lack
of co-localization with markers for early (EEA1-positive), multivesi- 
cular (Hrs-positive) or late (LAMP1-positive) endosomes around 
chromatin discs argues against an endosomal nature of this ESCRT-III
localization (Extended Data Fig. 1f, g).


CHMP4B recruitment occurs at a mitotic stage similar to reassembly
of the nuclear lamina, and nuclear pore complex reformation
(Extended Data Fig. 1h, i); however, there was little co-localization 
with the nucleoporin Nup153 (Extended Data Fig. 1j). Since direct 
association of ESCRT-III with membranes is essential for its role in 
membrane constriction and scission, we compared CHMP4B
dynamics with nuclear envelope (NE) reassembly. We found that 
CHMP4B-eGFP localized prominently to punctate sites where the 
NE is reassembled from mobilized endoplasmic reticulum, and was
largely absent from regions lacking apparent nuclear membranes 
(Fig. 1c, Extended Data Fig. 2a and Supplementary Video 2). As 
reported for CHMP1A, CHMP4B localization was nuclease resistant
(Extended Data Fig. 2b), whereas mutations in the amino (N)-terminal
helix (4DE) of CHMP4B, critical for membrane association, largely
abolished its recruitment around anaphase chromatin (Fig. 1d, 
Extended Data Fig. 2c and Supplementary Video 3). This suggests that 
CHMP4B localization does not require chromatin and rather indicates 
that ESCRT-III is recruited to discrete sites of the nuclear membrane 
during the NE reassembly process. 

In other ESCRT-dependent processes, CHMP4B recruitment 
depends on upstream regulators such as CEP55, ESCRT-0 (Hrs), 
ESCRT-I (TSG101), or the BRO-domain proteins ALIX, BROX and 
HD-PTP. However, these proteins showed no enrichment at the
reforming NE, and depletion of these factors did not appreciably affect 
CHMP4B recruitment to the reforming NE (Extended Data Fig. 2d). 
In striking contrast, depletion of the poorly characterized ESCRT-III-like
protein CHMP7 effectively abolished CHMP4B localization to
nuclear membranes without apparent effect on CHMP4B localization 
to the midbody (Fig. 1e, Extended Data Fig. 2e, f and Supplementary
Video 4). Our data identify CHMP7 as a novel and essential recruiter 
of ESCRT-III during NE reformation. 

VPS4 is critical to ESCRT-III function, with its inactivation causing 
persistent localization of ESCRT-III to membrane necks and preventing
membrane sealing. Like CHMP4B, eGFP-VPS4A transiently
localized around chromatin discs during late anaphase (Extended 
Data Fig. 3a). Importantly, upon depletion of VPS4, CHMP3 or 
CHMP2A a dramatic increase in the residence time of CHMP4B foci 
was observed (Fig. 2a and Supplementary Video 5), indicative of 
defective ESCRT-III function and recycling. This specific phenotype 
(Extended Data Fig. 3b) was particularly striking upon CHMP2A 
knockdown, resulting in the persistence of anaphase CHMP4B foci 
long into following cell cycles (Supplementary Video 6). Taken 
together, this suggests that ESCRT-III and VPS4, which mediate cytokinetic
membrane scission, also cooperate during NE reassembly.

Disruption of ESCRT-III function did not affect markers for general 
NE reassembly, nuclear pore complex accumulation or reestablish- 
ment of nuclear import (Fig. 2b and Extended Data Fig. 3c-e). As 
CHMP4B localized only to a limited number of foci during anaphase, 
we asked whether ESCRT-III functions at specific nuclear membrane 
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Figure 1 | ESCRT-III is transiently recruited around chromatin discs 
during nuclear envelope reformation. a, Confocal images of fixed HeLa cells 
in different phases of mitotic exit. Scale bars, 5 jum. Representative confocal 
images of at least five captures for each stage. b, Number of CHMP4B foci 
around chromatin discs in HeLa cells stably expressing CHMP4B-eGFP. 

n = 10 cells. Bars, mean with 95% confidence intervals. c, Deconvolved wide- 
field image of a living HeLa anaphase cell expressing CHMP4B-eGFP and 
mCherry-KDEL. Representative image from wide-field live cell imaging of at 
least ten cells. d, Number of CHMP4B foci appearing around anaphase 
chromatin discs in HeLa cells expressing wild-type (WT) CHMP4B-eGFP 

or CHMP4B-eGFP 4DE, depleted for endogenous CHMP4B. Whiskers 
illustrate the minima and maxima of samples. n = 18 cells for wild-type 
CHMP4B-eGFP; n = 25 cells for CHMP4B-eGFP 4DE. ***P < 0.0001
derived from unpaired t-test. e, Number of CHMP4B foci appearing around 
anaphase chromatin discs in HeLa cells expressing CHMP4B-eGFP 
decreases dramatically upon CHMP7 knockdown. Whiskers illustrate the 
minima and maxima of samples. n = 35 cells for control short interfering 
RNA (siRNA); n = 19 cells for CHMP7 siRNA. ***P < 0.0001 derived
from unpaired t-test. 


sites. Correlative light and electron microscopy (CLEM) analyses 
revealed overlap of CHMP4B foci with areas of the NE that are not 
fully closed and that could also contain electron dense contact sites 
between chromatin and microtubules (MTs) (Fig. 2c and Extended 
Data Figs 3f and 4, arrows). 

Structured illumination microscopy (SIM) highlighted a striking 
association of CHMP4B with MTs traversing holes in the reforming 
NE (Fig. 2d and Extended Data Fig. 5a), indicating that ESCRT-III 
functions at sites of intersection between nuclear membranes and 
MTs. Such intersections were found for polar MTs at the rim of
the chromatin disc and kinetochore MT bundles at its core regions 
(Extended Data Fig. 5b, c). Live tracking of CHMP4B-eGFP in cells 
co-expressing the kinetochore marker mCherry-CENP-A showed 
CHMP4B foci transiently at the rim of chromatin discs; however, as 
anaphase progressed, CHMP4B foci appeared prominently at the 
core regions of the chromatin disc to sites juxtapositioned with 
CENP-A, before disappearing from the chromatin discs at the end 
of anaphase (Fig. 2e, Extended Data Fig. 5d and Supplementary 
Video 7). Since NE reassembly initiates from the rim of chromatin 
discs and subsequently envelopes the central part of the chromatin 
disc, our data suggest that ESCRT-III foci are formed progressively at
NE-MT intersections as the nuclear membrane engulfs spindle attach- 
ments along the chromatin disc. The dependence on MTs is further 
supported by the fact that CHMP4B foci were readily observed during 
NE reassembly upon mitotic slippage in Taxol-treated cells, while they

Figure 2 | ESCRT-III and VPS4 cooperate at sites of NE and MT
intersection during nuclear envelope reassembly. a, Residence time of 
CHMP4B-eGFP around chromatin discs following indicated siRNA 
treatments. n = 17 cells for control siRNA; n = 20 cells for CHMP3 siRNA; n = 
16 cells for CHMP2A; n = 12 cells for VPS4A+B siRNA. Bars, mean with 95% 
confidence intervals. ***P values (P < 0.0001) derived from unpaired t-test. 
b, Normalized fluorescence intensity of mCherry-KDEL at the NE in HeLa 
cells expressing CHMP4B-eGFP and mCherry-KDEL (a.u., arbitrary units). 
n = 8 DNA discs for control siRNA; n = 8 DNA discs for CHMP3 siRNA. Bars,
mean and 95% confidence intervals. c, CLEM of an anaphase cell correlates 
CHMP4B-eGFP and mCherry-CENP-A foci (detected by light microscopy) 
with areas of unsealed NE (observed on EM sections). A confocal plane (z,,) with 
the respective electron micrographs is shown, including insets with increasing 
magnifications. Representative of 13 CHMP4B-eGFP and mCherry-CENP-A 
double-positive foci (Extended Data Fig. 4 and Methods). Label ‘s 42’ refers to 
section number 42 within the transmission electron microscopy serial sections. 
d, A z-stack from SIM (left) and Imaris surface three-dimensional renderings 
(right) of a fixed anaphase HeLa cell stained as indicated. SIM image represen- 
tative of at least ten captures. Imaris reconstruction was performed on four areas 
of intersection between CHMP4B, MTs and ER. e, Dynamic localization of 
ESCRT-III to kinetochore proximal regions in HeLa cells expressing
CHMP4B-eGFP and mCherry-CENP-A. Scale bar, 5 µm. Deconvolved wide-field images
from live-cell imaging are representative of at least ten videos, where t = 0 
represents onset of CHMP4B recruitment. f, Number of CHMP4B-eGFP foci 
appearing around chromatin in cells where mitotic slippage was induced in Taxol 
or nocodazole (Noc). n = 73 cells for nocodazole; n = 69 cells for Taxol. 
Whiskers illustrate the 10th-90th percentiles of samples. ***P < 0.0001 derived 
from unpaired t-test. 


were largely absent from nocodazole-treated cells (Fig. 2f and 
Supplementary Video 8). 

Live-cell tracking of MTs using mCherry-α-tubulin showed a
strong correlation between MT disassembly and CHMP4B-eGFP 
localization (Fig. 3a, b and Extended Data Fig. 6c). Importantly, knock- 
down of VPS4, CHMP3 or CHMP2A caused a specific delay in MT 
disassembly at CHMP4B-eGFP foci (Fig. 3c and Extended Data Fig. 
6a, b). Together with the negative correlation between local intensities 
of CHMP4B and α-tubulin along individual MTs (Extended Data Fig.
6c), this suggested that ESCRT-III is dynamically required for com- 
plete MT severing. Moreover, CLEM experiments showed that 
CHMP3-depleted cells maintain a high number of gaps in the NE with 
MTs attached to the chromatin discs (Fig. 3d, e and Supplementary 
Video 9). The collective data raised the possibility that ESCRT-III

Figure 3 | ESCRT-III-dependent recruitment of spastin to the reforming
NE mediates mitotic spindle disassembly. a, Tracking of CHMP4B 
association with a MT bundle over time. Deconvolved wide-field images 
representative of at least ten videos, with at least 30 events, were observed. 

b, Normalized fluorescence intensity of CHMP4B-eGFP and spindle MTs 
associated with CHMP4B foci. Normalized fluorescence intensity of mid- 
spindle MTs was used as control. n = 5. Bars, mean and s.e.m. c, Persistence of 
spindle MTs associated with CHMP4B foci. n = 23 MTs for control siRNA; n = 
26 MTs for CHMP3 siRNA; n = 18 MTs for CHMP2A siRNA; n = 22 MTs for
VPS4A+B siRNA. Whiskers illustrate the 5th-95th percentiles of samples. 
***P < 0.0001 derived from unpaired t-test. d, CLEM and EM tomography of
anaphase cells shows persistent MT attachments to chromatin discs with 
unsealed NE in CHMP3 knockdown cells. Nuc, nucleus. Representative of 
three experiments (n = 3 cells for control siRNA; n = 3 cells for CHMP3 
siRNA). e, Number of MT attachment points to chromatin discs scored from 
EM sections from d. n = 3 for control siRNA; n = 3 cells for CHMP3 siRNA. 
Bars, mean with s.d. *P < 0.05 derived from unpaired t-test. f, Confocal images 
of anaphase HeLa cells transiently transfected with GFP-spastin M87 


could coordinate membrane sealing with disassembly of MTs during 
mitotic exit. 

ESCRT-III is known to recruit the MT-severing AAA ATPase spastin
during completion of cytokinesis. Confocal imaging, SIM analysis
and live-cell microscopy showed that both endoplasmic
reticulum-associated spastin M1 and cytosolic spastin M87 co-localized
with CHMP4B-eGFP on nuclear membranes during late anaphase
(Fig. 3f and Extended Data Fig. 7a–c), in association with mitotic
spindle MT bundles (Extended Data Fig. 7c). Deletion analyses showed 
that the ESCRT-III-interacting MIT domain of spastin M87 is
required for its nuclear membrane localization whereas the MT-binding
domain (MBD) is dispensable, arguing for direct recruitment of spastin
by ESCRT-III (Fig. 3f and Extended Data Fig. 7d, e). This notion was
further supported by the finding that CHMP2A was essential for recruitment
of spastin and its reported interactors, CHMP1B and IST1
(Fig. 3g and Extended Data Fig. 7f-h). Interestingly, knockdown of 
IST1, but not CHMP1B, abolished spastin enrichment around chro- 
matin discs (Fig. 3g and Extended Data Fig. 7f-h), indicating that IST1 
serves as spastin recruiter to nuclear membranes. 

To monitor whether spastin is required for the stability of MTs 
at ESCRT-III sites at the NE, we monitored MT persistence at 
CHMP4B-eGFP foci. As for CHMP2A knockdown, depletion of spas- 
tin resulted in a significant delay in MT disassembly following 
CHMP4B-eGFP enrichment (Fig. 3h and Extended Data Fig. 6b). 
Importantly, spastin depletion or overexpression of dominant-

constructs. Scale bars, 5 µm. n = 5 cells for spastin M87; n = 7 cells for ΔMBD;
n = 15 cells for spastin ΔMIT; n = 3 cells for spastin ΔMBD ΔMIT.

g, Recruitment of spastin around anaphase chromatin in HeLa cells stably 
expressing mCherry-spastin M87 and CHMP4B-eGFP transfected with 
indicated siRNAs. n = 46 cells for control siRNA; n = 14 cells for CHMP2A 
siRNA; n = 11 cells for CHMP1B siRNA; n = 34 cells for IST1 siRNA. Results
for CHMP2A and IST1 depletion are significantly different from control and 
CHMP1B depletion, P < 0.001. h, Time of disappearance of spindle MTs 
associated with CHMP4B foci. n = 38 MTs for control siRNA; n = 67 MTs for 
spastin siRNA 1; n = 88 MTs for spastin siRNA 4. Whiskers illustrate the 5th–95th
percentiles of samples. ***P < 0.001 derived from Dunnett’s multiple
comparison test. i, Residence time of CHMP4B-eGFP localization around 
anaphase chromatin discs upon treatment with indicated siRNAs. n = 17 cells 
for control siRNA; n = 21 cells for spastin siRNA 1; n = 15 cells for spastin
siRNA 2; n = 16 cells for spastin siRNA 3; n = 20 cells for spastin siRNA 4. Bars, 
mean with 95% confidence intervals. *P < 0.05, **P < 0.01, ***P < 0.001 
derived from Dunnett’s multiple comparison test.


negative spastin M87 prolonged the residence time of CHMP4B-eGFP
around chromatin discs (Fig. 3i and Extended Data Fig. 8a, b),
suggesting that ESCRT-III persistence at nuclear membrane foci is 
affected by the presence of MTs. Accordingly, CHMP4B-eGFP res- 
idence time almost doubled in the presence of Taxol (Extended Data 
Fig. 8c and Supplementary Video 10), arguing that ESCRT-III function 
at membrane foci cannot be completed until MTs penetrating the 
reassembling NE are removed. 

We assessed nucleocytoplasmic compartmentalization in cells co- 
expressing histone 2B (H2B) fused to FRB (FK506 binding protein 
(FKBP)-rapamycin binding domain of mTOR) and mCherry-tagged 
maltose-binding protein (MBP) fused to FKBP. Addition of rapalog 
induces heterodimerization of FRB- and FKBP-fusions and traps sol- 
uble bulky MBP-mCherry on chromatin as it diffuses into the nucleus. 
Importantly, CHMP2A knockdown resulted in a highly significant 
increase in nuclear trapping rates compared with control cells 
(Fig. 4a), indicating that nuclear integrity is compromised upon 
ESCRT-III dysfunction. 

Compromised NE integrity has been associated with DNA 
damage, and we found that knockdown of CHMP2A or VPS4
resulted in significantly increased DNA damage (Fig. 4b, c and 
Extended Data Fig. 9a). Importantly, DNA damage cluster markers 
(γ-H2AX and 53BP1) were closely associated with persistent
CHMP4B-eGFP foci and enrichment of nuclear lamina markers 
(Fig. 4c) but not nuclear pore complexes (Extended Data Fig. 9b),

Figure 4 | ESCRT-III dysfunction compromises nuclear integrity and leads
to DNA damage and cell cycle arrest. a, Nuclear/cytoplasmic (nuc/cyto) ratio 
of FKBP-MBP-mCherry in late cytokinetic HeLa cells expressing CHMP4B- 
eGFP, FKBP-MBP-mCherry and H2B-FRB. Nuclear/cytoplasm ratio of 
FKBP-MBP-mCherry was plotted over time. Red and green lines represent the 
linear regression for control and CHMP2A siRNA-treated cells respectively. 
n = 52 cells for control siRNA; n = 51 cells for CHMP2A siRNA. Bars, 
mean and s.e.m. The difference between the slopes is highly significant (***P <
0.0001). b, Induction of γ-H2AX foci in hTERT RPE1 cells. n = 3,414 cells for
control siRNA; n = 2,320 cells for CHMP2A siRNA; n = 3,410 cells for 
VPS4A+B siRNA. Images collected on high-content ScanR microscope were 
scored for number of γ-H2AX foci ≥ 20 pixels. Percentage of population was
plotted. Bars, mean and s.e.m. **P < 0.005 and ***P < 0.001 were derived 
from unpaired t-test. c, SIM of CHMP4B-eGFP HeLa cells depleted for
CHMP2A shows multiple structures where γ-H2AX, CHMP4B and emerin are
highly associated. Scale bar, 5 µm. Representative of n = 38 DNA damage
structures, with high association of CHMP4B and DNA damage clusters
(97.36%) and association with nuclear lamina enrichments (100%) observed.
d, Frequency of association of CHMP4B with 53BP1 or emerin within DNA 
damage structures induced by CHMP2A depletion in CHMP4B-eGFP HeLa 
cells. n = 300 cells for control siRNA; n = 358 cells for VPS4 siRNA; n = 361 
cells for CHMP2A siRNA; n = 300 cells for CEP55 siRNA; n = 325 cells for 
CHMP2A + CEP55 siRNA. Bars, mean and s.e.m. VPS4, CHMP2A and 
CHMP2A + CEP55 KD were highly significantly different from control and
CEP55 KD (***P < 0.0001, unpaired t-test). e, Immunoblot showing
cellular p21 levels in hTERT RPE1 upon treatment with indicated siRNAs.
Immunoblot is representative of four independent experiments. Asterisk 
indicates non-specific immunoreactivity. 


suggestive of nuclear envelope perturbation. These clusters did not 
originate from untimely cytokinetic abscission in the presence of 
chromatin bridges, as impairment of abscission by CEP55 knockdown 
did not affect their formation (Fig. 4d and Extended Data Fig. 9c). The 
absence of CHMP4B from ionizing-radiation-induced DNA damage 
foci (Extended Data Fig. 9d) argues against a general role for ESCRT- 
III in DNA damage response and rather suggests that ESCRT-III 
dysfunction induces a condition permissive to DNA damage. 
Compromised genome integrity leads to p53-dependent cell cycle 
arrest, characterized by induction of the cyclin-dependent kinase 
inhibitor p21 (ref. 26). Indeed, elevation of p21 levels upon knockdown 
of CHMP2A or VPS4 (Fig. 4e) indicates the induction of cell cycle 
arrest and underscores the physiological relevance for ESCRT-III 
function during anaphase in maintaining cell fitness.

Collectively, our data reveal a novel role for ESCRT-III and spastin
in coordination of NE reassembly and spindle disassembly during 
late anaphase. We propose that during NE reassembly, nuclear 
membranes encounter MTs associated with chromatin discs, preclud- 
ing membrane closure (Extended Data Fig. 9e). To seal these holes, 
ESCRT-III is recruited by CHMP7 to sites of NE-MT intersection, 
with IST1 subsequently recruiting spastin to sever MTs. In concert 
with VPS4, ESCRT-III mediates constriction and sealing of the holes in 
the NE. Impairment of such sealing compromises nuclear integrity and 
nucleo-cytoplasmic compartmentalization, ultimately culminating in 
DNA damage. Together with the established function for ESCRT-III
during cytokinetic abscission checkpoint signalling in the presence of 
pathological chromatin bridges, our results place ESCRT-III at
centre stage for safeguarding the genome through mitotic exit. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper.

METHODS

Antibodies. Rabbit anti-human CHMP4B antibody (western blot 1:2,000, 
immunofluorescence 1:1,000) and rabbit anti-CHMP3 (western blot 1:2,000, 
immunofluorescence 1:1,000) were produced as previously described. Rabbit
anti-CHMP2A (Proteintech; western blot 1:500, immunofluorescence 1:100), rab- 
bit anti-CHMP7 (Atlas Antibodies; Sigma-Aldrich; western blot 1:250), rabbit 
anti-CHMP1B (Fitzgerald; western blot 1:500), rabbit anti-IST1 (Proteintech; 
western blot 1:500, immunofluorescence 1:100), mouse anti-spastin (Santa Cruz 
Biotechnology sc-53443; western blot 1:400, immunofluorescence 1:50), rabbit 
anti-VPS4 (Sigma-Aldrich; western blot 1:500, immunofluorescence 1:200), rabbit 
anti-p21 (Santa Cruz Biotechnology H-164; western blot 1:700), mouse
anti-γ-H2AX (Millipore 05-636; immunofluorescence 1:100), rabbit anti-53BP1 (Santa
Cruz Biotechnology sc-22760; immunofluorescence 1:200), mouse anti-LAMP1 
(DSHB; immunofluorescence 1:200), mouse anti-emerin (NeoMarkers MS-1751; 
immunofluorescence 1:100), mouse anti-Lamin A (Abcam ab8980; immunofluor- 
escence 1:100), mouse anti-nuclear pore complex (pan NPC; Acris Antibodies; 
immunofluorescence 1:100), mouse anti-CEP55 (Abnova; western blot 1:300), 
mouse anti-α-tubulin (Sigma-Aldrich; western blot 1:10,000, immunofluorescence
1:800), sheep anti-α-tubulin (Cytoskeleton; immunofluorescence 1:100),
mouse anti-β-actin (Sigma-Aldrich; western blot 1:30,000), mouse anti-Nup153
(Covance MMS-102P; immunofluorescence 1:200), rabbit anti-CHMP4A (Santa 
Cruz Biotechnology sc-67229; immunofluorescence 1:50), goat anti-V5 (Abcam;
immunofluorescence 1:200), mouse anti-GFP (Living Colours; immunofluorescence
1:100), rabbit anti-Flag (Cell Signaling Technology 2368; immunofluorescence
1:400), mouse anti-HA (Covance; immunofluorescence 1:100), goat
anti-mCherry (Acris Antibodies; 1:200) were used as primary antibodies. Human
anti-EEA1 serum was a gift from B.-H. Toh. Rabbit anti-Hrs (immunofluorescence
1:100) was described in ref. 34. RFP-booster and GFP-booster (Chromotek) were 
used 1:200 for immunofluorescence. Secondary antibodies included anti-mouse, 
anti-rabbit and anti-goat Alexa Fluor 488 (Jackson), Alexa Fluor 555 (Molecular 
Probes), Alexa Fluor 568 (Molecular Probes), Alexa Fluor 647 (Jackson) and Cy5 
(Jackson). 

Cell culture. Cell lines were cultured in Dulbecco’s modified Eagle’s medium 
(DMEM; Gibco) supplemented with 10% fetal bovine serum (FBS), 5 U ml⁻¹
penicillin and 50 µg ml⁻¹ streptomycin. For live-cell imaging, cells were seeded
into Lab-Tek chambered coverslips (Nunc) or MatTek glass bottom microwell 
dishes (MatTek Corporation). For immunofluorescence studies, cells were grown 
on Precision cover glass (thickness 0.170 + 0.005 mm; Marienfeld). Samples were 
prepared on Precision cover glass and fixed with standard methods. Briefly, for 
SIM microscopy and α-tubulin stainings, microtubule integrity was preserved by
fixing the cells with 4% EM grade formaldehyde (Polysciences) diluted in PEM 
buffer (80 mM K-Pipes, 5 mM EGTA, 1 mM MgCl2 (pH 6.8)) for 5 min at 37°C.
Cells were then permeabilized by treatment with PEM/0.15% Triton X-100 for 2 
min. For confocal microscopy, methanol fixation was performed for 10 min at
−20°C. Primary and secondary antibodies were diluted in PBS/0.05% saponin
and incubated for 2-3 h for SIM or 1 h for confocal imaging. After antibody 
staining, samples were mounted on microscope slides (Menzel-Glaser) with 
Mowiol for standard confocal imaging. Vectashield (Bioteam AS) or SlowFade 
Gold Antifade Reagent (Life Technologies) was used for SIM imaging. 
Transient plasmid transfections. For transfection of eGFP-spastin constructs, 
cells were grown on Precision cover glass in 24-well plates and transfected with 0.5 
µg pcDNA3.1-HAeGFP-spastin M87 (full length), pcDNA3.1-HAeGFP-spastin
M87 ΔMBD (deletion 270-328 counted from M1), pcDNA3.1-HAeGFP-spastin
M87 ΔMIT (deletion 116-194 counted from M1) or pcDNA3.1-HAeGFP-spastin
M87 ΔMBD/ΔMIT (deletions 116-194 and 270-328 counted from M1) complexed
with JetPEI (Polyplus) for 24 h.

SiRNA treatments. All siRNAs were purchased from Ambion and contained the 
Silencer Select modification. Cells at 50% confluency were transfected using
Lipofectamine RNAiMAX transfection reagent (Life Technologies) following 
the manufacturer’s instructions. Cells were transfected with 50 nM siRNA target- 
ing CHMP4B (CATCGAGTTCCAGCGGGAG), CHMP3 (GGAAGAAGCAGA 
AATGGAA), CHMP1B (ACATGGAAGTTGCGAGGAT), IST1 1 (CCAAG 
TATAGCAAGGAATA), IST1 2 (GCAAATACGCCTTTCTCAT), CHMP7 1 
(AGGTCTCTCCAGTCAATGA), CHMP7 2 (GCAATAGGCATTTTACCAA), 
CHMP7 3 (GGATGAAGTTTCTCAGACT), CHMP2A 1 (AGGCAGAGAT 
CATGGATAT), CHMP2A 2 (AAGATGAAGAGGAGAGTGA), spastin 1 (pre- 
designed: CAACCTTGCTAACCTTATA; siRNA s13348), spastin 2 (pre- 
designed: GGAAGTCCATTGACCCAAA; siRNA s13349), spastin 3 (CCAATA 
TAATCATAATGTA) and spastin 4 (AAAACAGACTTAAACAAAA), CEP55 
(GGAGAAGAATGCTTATCAA), Hrs (CGACAAGAACCCACACGTC), 
Tsg101 (CCGTTTAGATCAAGAAGTA), Alix (GCAGTGAGGTTGTAAA
TGT), HD-PTP (GCAAACAGCGGATGAGCAA), BROX (GCAAAAGAAG 
TTCATCGAA) for 48 or 72 h. For VPS4 knockdown expreriments, cells were 


transfected with 25 nM of siRNA targeting VPS4A (CCGAGAAGCT 
GAAGGATTA), plus 25 nM of siRNA targeting VPS4B (CCAAAGAAG 
CACTGAAAGA) for 24 h. Non-targeting control Silencer Select siRNA (pre- 
designed, catalogue number 4390844) was used as control. 

Confocal fluorescence microscopy. Fixed samples were imaged with a Zeiss LSM 
710 or 780 confocal microscope (Carl Zeiss MicroImaging) equipped with an
Ar-Laser Multiline (458/488/514 nm), a DPSS-561 10 (561 nm), a Laser diode 405-30
CW (405 nm) and a HeNe-laser (633 nm). The objective used was a Zeiss
Plan-Apochromat ×63/1.40 Oil DIC M27. Image processing was performed with basic
software ZEN 2009 (Carl Zeiss MicroImaging) and ImageJ software (National
Institutes of Health) which was used in CHMP4B depletion experiments 
(Extended Data Fig. 7e). The mean fluorescence intensity of spastin staining at 
anaphase chromatin discs was determined with the nuclear area as the defined 
region of interest (ROI). Here, intensity settings for the relevant channels were 
kept constant during imaging. Images shown in figures are representative of at 
least three independent experiments. For quantification of DNA damage structures
induced by CHMP2A and VPS4 depletion (Fig. 4c), fixed hTERT RPE1 cells
were immunolabelled for CHMP4B, γ-H2AX and DNA and imaged. Structures
that showed high association of γ-H2AX with CHMP4B were manually scored.
The frequency of association of CHMP4B with 53BP1 and emerin at DNA damage
foci (Fig. 4c, explained in Methods) was manually scored on confocal images of
CHMP2A-depleted hTERT RPE1 cells.

Live microscopy. Cells seeded in Lab-Tek four-well Chambered Coverglass 
(Nunc) were imaged on a DeltaVision microscope (Applied Precision) equipped 
with an Elite TruLight Illumination System, a CoolSNAP HQ2 camera and a ×60
Plan Apochromat (1.42 numerical aperture) lens. For temperature control during 
live observation, the microscope stage was kept at 37°C by a temperature-controlled
incubation chamber. Time-lapse images (14 z-sections 1 µm apart for
CHMP4B-eGFP HeLa cells; 6 z-sections 0.5 µm apart for mCherry-KDEL/
CHMP4B-eGFP HeLa cells, mCherry-NUP58/CHMP4B-GFP HeLa cells and
IBB-eGFP/H2B-RFP transfected HeLa cells) were acquired every 20 s from
anaphase onset, deconvolved using softWoRx software (Applied Precision, GE 
Healthcare) and processed with ImageJ for presentation and quantifications. 
Short-term, high-resolution live-cell imaging was performed on a Deltavision 
OMX V4 microscope equipped with three water-cooled PCO.edge sCMOS cameras,
a solid-state light source and a laser-based autofocus. To allow deep imaging
with minimal spherical aberration, a ×60 1.3 numerical aperture silicon oil
immersion lens (Olympus) was used. Cells were imaged in Live Cell Imaging
buffer (Invitrogen) supplemented with 20 mM glucose. Environmental control 
was provided by a heated stage and an objective heater (20-20 Technologies). 
Images were deconvolved using softWoRx software and processed in ImageJ/FIJI
or Imaris.

Super-resolution imaging using structured illumination. SIM imaging was 
performed on a Deltavision OMX V4 microscope equipped with three water-cooled
PCO.edge sCMOS cameras, 405 nm, 488 nm, 568 nm and 642 nm laser lines
and a ×60 1.42 numerical aperture Plan Apochromat lens (Olympus). z-Stacks
covering the whole cell, with sections spaced 0.125 µm apart, were recorded. For
each z-section, 15 raw images (three rotations with five phases each) were 
acquired. Final super-resolution images were reconstructed using softWoRx soft- 
ware and processed in ImageJ/FIJI or Imaris. 
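
To give a feel for the acquisition load these settings imply, the following minimal Python sketch counts z-sections and raw frames per channel for one z-stack. The section spacing (0.125 µm) and the 15 raw images per section (three rotations, five phases) come from the text; the 10 µm cell height and the function name are assumptions chosen only for illustration.

```python
def sim_frame_count(cell_height_um, z_step_um=0.125, rotations=3, phases=5):
    """Return (z-sections, raw frames per channel) for one SIM z-stack.

    z_step_um, rotations and phases follow the settings stated in the text;
    cell_height_um is a hypothetical input, not a value from the paper.
    """
    # Sections needed to span the cell, counting both the first and last plane.
    sections = int(round(cell_height_um / z_step_um)) + 1
    raw_per_section = rotations * phases  # 3 rotations x 5 phases = 15 raw images
    return sections, sections * raw_per_section


# Assumed example: a cell roughly 10 µm tall.
sections, frames = sim_frame_count(10.0)
print(sections, frames)
```

The count scales linearly with the number of channels, which is one reason SIM stacks were recorded on fixed samples here.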

High-content microscopy. For DNA damage experiments (Fig. 4b), an Olympus
ScanR system (illumination system with an UPLSAPO ×40 objective) was used
for imaging of a large number of cells in fixed samples that were immunolabelled
for γ-H2AX, CHMP4B and DNA. Images were then processed with ImageJ for
quantification. 

Taxol treatments. HeLa cells stably expressing CHMP4B-eGFP were imaged live 
on a DeltaVision microscope every 20 s from anaphase onset. DMSO or 2 µM
Taxol in DMSO (Sigma-Aldrich) was added to cells in early anaphase (approxi- 
mately 140 s after anaphase onset) during imaging. 

Mitotic slippage assay. HeLa cells stably expressing CHMP4B-eGFP were synchronized
in mitosis by incubating with 10 µM Taxol (Sigma-Aldrich) or 10 µM
nocodazole (Calbiochem) for 2 h. Fields containing mitotic cells were selected on a
DeltaVision microscope and mitotic slippage was induced by adding 10 µM of the
CDK1 inhibitor RO-3306 (Tocris). After addition of CDK1 inhibitor, cells were 
imaged live every minute. 

DNase I treatment. Formaldehyde-fixed HeLa cells were subjected to DNase I 
(Sigma-Aldrich) treatment as previously described. Cells were then immunolabelled
for endogenous CHMP4B, α-tubulin and DNA.

Image processing. Image processing used ImageJ/FIJI software. For NE
reformation assay and the nuclear pore deposition assay, mCherry-KDEL or 
mCherry-Nup58 fluorescence intensity was measured on a line manually drawn 
along the nucleus for each time point and normalized values plotted over time. For 
measuring residence time of CHMP4B-eGFP around chromatin, the presence of
CHMP4B foci around chromatin discs was manually scored over time and the
total number of frames (seconds) positive for CHMP4B foci was plotted. For 
tracking of spindle MTs reaching CHMP4B foci (Fig. 3b), live HeLa cells stably 
co-expressing mCherry-α-tubulin and CHMP4B-eGFP were imaged every 5 s
during anaphase using a Deltavision OMX V4 microscope. Five-pixel ROIs were 
selected at CHMP4B foci intersecting with spindle MTs, and fluorescence intensities
of mCherry-α-tubulin and CHMP4B-eGFP were measured for each time
point. Fluorescence intensity values were then normalized to 100 and data sets 
were aligned in time according to CHMP4B arrival at MTs. For control MTs, ROIs 
were selected at the spindle midzone. For time of spindle MTs persistence in 
siRNA KD experiments (Fig. 3c, h and Extended Data Fig. 6b), cells were imaged 
every 30 s during anaphase using a Deltavision OMX V4 microscope. For scoring 
the time of disappearance of spindle MTs at CHMP4B foci, only clear mid-spindle 
MT-CHMP4B structures were manually analysed. The time from CHMP4B onset 
on each analysed MT to the disappearance of the same MT was scored. For 
representative videos, time-lapse series were de-bleached using the ‘correct bleach- 
ing’ function in FIJI. Bitplane Imaris software was used to generate surface render- 
ings of three-dimensional-SIM images. To track association of CHMP4B and 
CENP-A, time-lapse micrographs were segmented using the ‘Spots’ function in 
Imaris and analysed using the ‘Colocalize Spots’ function, with a distance threshold
of 0.3 µm. Nuclear import was assayed automatically using a customized ImageJ
script. First, nuclei were segmented using the H2B channel of the images by 
median filtering and Otsu thresholding. The ImageJ function ‘Analyse Particles’
was then used to define ROIs. On the basis of these ROIs, the mean intensity of the 
nuclei was measured in the importin-β-binding domain of importin-α (IBB)
channel. In parallel, a band-shaped ROI around the nuclei was used to measure 
the mean intensity of the cytoplasm as regularization factor. Nuclear import rates 
were determined by normalizing the mean intensity of the nuclear IBB signal to 
the overall IBB signal of the nucleus and cytoplasm. DNA damage was assessed by 
an ImageJ script that first generated ROIs by segmenting Hoechst-stained nuclei 
by Otsu thresholding and the ImageJ ‘Analyse Particles’ function. These ROIs were
then used to segment and measure γ-H2AX-positive foci within spots. DNA
damage was scored by counting the number of cells with large (≥20 pixels)
DNA damage foci. Likewise, the mean intensity of these γ-H2AX foci was determined.
To assess the number of CHMP4B foci in proximity to anaphase nuclei
over time (Fig. 1b), nuclei were segmented using their H2B fluorescence signal by 
median filtering, Otsu thresholding and the ‘Analyse Particles’ function. Within
these segmented nuclei, the number of CHMP4B foci was counted. To ease detec- 
tion of these foci, we performed a ‘Difference of Gaussian’ filtering and then used 
the ‘Find Maxima’ function of ImageJ to detect nuclei. Counting of these foci was
performed on maximum intensity projected images to find foci from all imaging 
planes. For each cell, anaphase onset was recorded and the measurements were 
aligned to this time point. Analysis of CHMP4B foci in CHMP7 knockdown cells 
(Fig. 1e) used a similar procedure. Again, nuclei were segmented and used as a
mask for spot analysis. Owing to different imaging modalities, these images 
showed more noise, which was enhanced by the ‘Difference of Gaussian’ filtering 
and skewed the scoring of spots. To avoid this limitation, we first subtracted the 
cytoplasmic background (cytoplasm mean intensity + 2 s.d.) before maximum 
detection. In this case, only single time points and a single, central plane in mid- 
anaphase (1-2 min after the beginning of furrow ingression) were chosen for 
measurement. Localization of membrane-defective CHMP4B to anaphase nuclei 
(Fig. 1d) was scored in the same manner, but a maximum intensity projection of all 
imaging planes was used. Temporal alignment was based on cell and chromatin 
shape. Post-processing of automatically measured image data used Python and the 
‘Pandas’ data analysis package. Data points were plotted using Graphpad Prism.
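
Several of the segmentation steps above rely on Otsu thresholding. As an illustration only, a minimal pure-Python sketch of the underlying idea (choosing the grey level that maximizes the between-class variance) is given below. It is not the ImageJ implementation used for the analysis, it omits the median filtering step, and the function names and pixel values are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes between-class variance.

    Pixels with value <= t are treated as background, pixels > t as foreground.
    `pixels` is a flat list of integers in the range [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    w_bg = 0        # number of background pixels so far
    sum_bg = 0.0    # sum of background grey values so far
    best_t, best_var = 0, -1.0

    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue            # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break               # no foreground pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor of 1 / total**2).
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def segment(pixels):
    """Return a boolean mask marking pixels above the Otsu threshold."""
    t = otsu_threshold(pixels)
    return [p > t for p in pixels]


# Hypothetical intensities: dim cytoplasm and bright H2B-labelled chromatin.
example = [8, 10, 12, 9, 11, 200, 205, 198, 202]
print(otsu_threshold(example), sum(segment(example)))
```

In practice the resulting mask would be passed to a connected-component step (the role played by ‘Analyse Particles’) to obtain one ROI per nucleus.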

Tubulin line plots analysis. To analyse the effects of CHMP4B presence on MT
structure (Extended Data Fig. 6c), we measured α-tubulin intensity around
CHMP4B spots using line plots. To this end, we chose MTs lying in a single plane
and measured the fluorescence of both α-tubulin and CHMP4B in a two-pixel-wide
line drawn along the MT. Measured values were transferred to Graphpad
Prism for plotting and Spearman’s correlation analysis. 
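
For readers who want to reproduce this type of line-plot analysis outside Prism, the sketch below computes Spearman's rank correlation in plain Python as the Pearson correlation of tie-averaged ranks. The intensity values and function names are hypothetical, and the code is an illustration of the statistic rather than the procedure used to generate the published figure.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical line-plot intensities sampled along one microtubule.
chmp4b = [5, 20, 60, 90, 100]
tubulin = [100, 80, 45, 30, 10]
print(spearman_rho(chmp4b, tubulin))
```

A value close to -1 would correspond to the negative correlation between CHMP4B and α-tubulin intensities described in the main text.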

Trapping of substrates by rapalog-mediated dimerization. The HeLa ‘Kyoto’ 
CHMP4B-eGFP BAC-tagged cell line was transduced with a lentivirus expressing 
H2B-FRB and FKBP-MBP-mCherry (which lacked a nuclear localization signal). 
These proteins were expressed as a single, T2A-cleaved open reading frame under 
control of the constitutive EF1α promoter. Normally, we observed a diffuse localization
of FKBP-MBP-mCherry in both the cytoplasm and nucleus. Addition of
Rapalog caused trapping of the freely diffusible, but bulky FKBP-MBP-mCherry 
on chromatin. This was especially evident on metaphase chromatin, where we
observed a rapid accumulation of mCherry at chromatin. Owing to the persistent 
permeability of the nucleus after anaphase, we decided to analyse nuclei of cells in
late cytokinesis. We reasoned that these cells recently went through anaphase and 
would be challenged with the effects of ESCRT knockdown. Cells in late cytokinesis
were selected on the basis of the presence of CHMP4B at the midbody. Rapalog
(iDimerize Heterodimerizer, Clontech) was added to a final concentration of 1 µM
and cells were imaged for 30 min with 1 min time intervals. Nuclear leakage was 
assessed by measuring the fluorescence in the nucleus and cytoplasm within a 
manually placed circular ROI of 20-pixel diameter. The ratio of nuclear and 
cytoplasmic fluorescence was plotted and a linear regression line was fitted to 
the data points using Graphpad Prism software. 
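
The leakage readout described above (a nuclear-to-cytoplasmic ratio followed over time, with a fitted line) can be sketched as follows. This is an illustrative ordinary least-squares fit under stated assumptions, not the authors' Prism workflow; the function names and the fluorescence values are hypothetical.

```python
def nuclear_cytoplasmic_ratio(nuclear, cytoplasmic):
    """Ratio of nuclear to cytoplasmic ROI fluorescence at each time point."""
    return [n / c for n, c in zip(nuclear, cytoplasmic)]


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical ROI readings at 1 min intervals (made-up values, arbitrary units).
minutes = [0, 1, 2, 3, 4]
nuclear_signal = [100.0, 110.0, 120.0, 130.0, 140.0]
cytoplasmic_signal = [100.0, 100.0, 100.0, 100.0, 100.0]

ratios = nuclear_cytoplasmic_ratio(nuclear_signal, cytoplasmic_signal)
slope, intercept = fit_line(minutes, ratios)
print(f"ratio change per minute: {slope:.3f}")
```

In this sketch the slope summarizes how quickly the ratio changes after Rapalog addition; how a given slope should be interpreted depends on the experiment and is not asserted here.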

Stable cell lines. A stable HeLa ‘Kyoto’ cell line expressing CHMP4B-eGFP was
obtained from A. Hyman. All other stable cell lines were lentivirus-generated
pools. To achieve low expression levels, the weak PGK promoter was used for
transgene expression. For higher expression levels, CMV or EF1α promoters were
used. Third-generation lentivirus was generated using procedures and plasmids as
previously described. Briefly, (eGFP/mCherry/LSSmKate1/Flag/V5 fusions of)
transgenes were generated as Gateway ENTRY plasmids using standard molecular
biology techniques. From these vectors, lentiviral transfer vectors were generated
by recombination into lentiviral destination vectors (Addgene plasmids 19067,
19068, 41393; and vectors derived from pCDH-EF1α-MCS-IRES-PURO
(System Biosciences)) using Gateway LR reactions. VSV-G pseudotyped lentiviral
particles were packaged using a third-generation packaging system (Addgene
plasmids 12251, 12253, 12259). Cells were then transduced with low virus titres
(multiplicity of infection = 1) and stable expressing populations were generated by
antibiotic selection. Detailed cloning procedures are available from the authors.
We used the following stable cell lines, each listed as background; new
transgene (plasmid).

HeLa ‘Kyoto’; HA-eGFP-VPS4A (pLenti-PGK_HA-eGFP-VPS4A_Puro).

HeLa ‘Kyoto’; HA-CHMP4B (pLenti-PGK_HA-CHMP4B_Bsd).

HeLa ‘Kyoto’; CHMP1A-V5 (pLenti-PGK_CHMP1A-V5_Puro).

HeLa ‘Kyoto’; CHMP1B-FLAG (pLenti-PGK_CHMP1B-FLAG_Puro).

HeLa ‘Kyoto’; IST1-mCherry (pLenti-PGK_IST1-mCherry_Puro).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); mCherry-α-tubulin
(pCDH-EF1α_mCherry-α-Tubulin_Bsd).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); H2B-mCherry (pCDH-EF1α_H2B-mCherry_Bsd).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); mCherry-KDEL (pCDH-EF1α_mCherry-KDEL_Puro,
derived from Addgene plasmid 36204).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); mCherry-CENP-A
(pLenti-PGK_mCherry-CENP-A_Bsd).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); mCherry-NUP58
(pLenti-PGK_mCherry-NUP58_Puro).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); mCherry-Spastin-M1
(pLenti-PGK_mCherry-spastin-M1_Puro).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); mCherry-Spastin-M87
(pLenti-PGK_mCherry-spastin-M87_Puro).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); mCherry-spastin-M87"“"°
(pCW57.1-TetON_mCherry-spastin-M87"*”2_Puro).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); mCherry-CENP-A; LSSmKate1-α-tubulin
(pCDH-EF1α_LSSmKate1-α-tubulin_Bsd, derived from Addgene plasmid 31902).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); H2B-mCherry; CHMP2A wild type
(pLenti-PGK_CHMP2Awt_Puro).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); H2B-mCherry; CHMP2A siRNA1 resistant
(pLenti-PGK_CHMP2Ares_Puro).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); mCherry-α-tubulin; CHMP2A wild type
(pLenti-PGK_CHMP2Awt_Puro).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); mCherry-α-tubulin; CHMP2A siRNA1
resistant (pLenti-PGK_CHMP2Ares_Puro).

HeLa ‘Kyoto’; H2B-mCherry; CHMP4B-eGFP siRNA resistant
(pLenti-PGK_CHMP4B-eGFP_Puro).

HeLa ‘Kyoto’; H2B-mCherry; CHMP4B(4DE) (V3D, F4D, L7E, F8D)-eGFP
siRNA resistant (pLenti-PGK_CHMP4B(4DE)-eGFP_Puro).

HeLa ‘Kyoto’ CHMP4B-eGFP (BAC); H2B-FRB; FKBP-MBP-mCherry
(pCDH_EF1α_H2B-FRB-2A-FKBP-MBP-mCherry).

Protein blotting. Cells were lysed in 2× sample buffer (100 mM Tris-HCl pH
6.8, 4% SDS, 20% glycerol, 200 mM DTT, bromophenol blue). The whole-cell 
lysate was subjected to SDS-polyacrylamide gel electrophoresis on a 4-20% 
gradient gel. Proteins were transferred to Immobilon-P membrane 
(Millipore). Immunodetection was performed using Supersignal West Dura 
Extended Duration Substrate (Pierce) followed by conventional film exposure 
and developing (Amersham Hyperfilm ECL, GE), or fluorescently labelled sec- 
ondary antibodies and Odyssey developer. Western blots shown in figures are 
representative of at least three independent experiments. 

CLEM. For CLEM using confocal microscopy, cells were grown on gridded cover- 
slips (EMS) and fixed in 4% formaldehyde, 0.1% glutaraldehyde in 0.1 M PHEM 
and processed for EM as described below. For CLEM in combination with live-cell
imaging, HeLa CHMP4B-eGFP cells were grown on gridded glass-bottom dishes
(MatTek) for 20 h and imaged every 20 s from the onset of anaphase. At the desired 
time point, cells were fixed in 2% glutaraldehyde in 0.1 M PHEM buffer. After 1 h 
the cells were washed in PHEM buffer and postfixed for 1 h with 2% OsO4 and
1.5% K4(Fe(CN)6) in 0.1 M PHEM, followed by 0.5% tannic acid (30 min). After
staining with 4% uranyl acetate in distilled H2O (30 min) the coverslips were
dehydrated in graded series of ethanol and embedded in Epon. Serial, semi-thin 
sections (120-150 nm) through the whole nucleus were cut (Leica EM FCS ultra- 
microtome) parallel to the substrate and placed on carbon/formvar-coated slot- 
grids (EMS) and contrasted with 1% Pb-citrate. Sections were observed at 80-120 
kV on a JEOL JEM-1230 electron microscope and micrographs were recorded 
with a digital camera (Morada) using iTEM (SIS) software. Image processing was 
done with Adobe Photoshop CS2. First the image series were aligned in CS2, 
followed by serial reconstruction using three-dimensional modelling of the whole 
nucleus. This model was fitted onto a three-dimensional reconstruction of
confocal sections (600 nm in thickness, 22 sections in total) from the same nucleus
using Imaris. For simpler alignment we used the maxcentre function in Imaris to 
mark CHMP4B-eGFP- and mCherry-CENP-A-containing spots and to correlate 
these with the appropriate EM sections. 

Quantification of MT attachment points to chromatin discs. Cells treated with 
control and CHMP3 siRNA (three cells per condition) were fixed at the same time 
point after anaphase onset and samples were processed for EM as described above. 
Semi-thin serial sections covering the whole-cell volume were inspected and gaps 
in the nuclear envelope with three or more MTs attached were scored. 

Electron tomography. Semi-thin sections (160-200 nm) were cut parallel to the 
substrate and placed on carbon/formvar-coated slotgrids (EMS). Samples were 
observed on a FEI Tecnai microscope at 120 kV and image series were taken 
between −58° and 58° with 2° increments. Series were recorded with a FEI camera
at 1.9 nm pixel size. Tomograms were computed using weighted back-projection
and the IMOD package.
Sample sizes and statistical analyses. All data sets presented in this paper derive 
from at least three independent experiments. No statistical methods were used to 
predetermine sample size. 
Extended Data Figure 1 | ESCRT-III is transiently recruited around
chromatin discs during nuclear envelope reformation. a, Confocal image of
HeLa cells treated with control siRNA or CHMP4B targeting siRNA and
stained as indicated. Scale bars, 10 μm. Image is representative of at least five
captures. b, Immunoblot of whole-cell lysates probed for endogenous
CHMP4B and β-actin. c, Confocal images of H199v, hTERT RPE1 and U2OS
cells showing localization of endogenous CHMP4B around chromatin discs
during late anaphase. Cells were co-stained for α-tubulin and DNA (Hoechst).
Scale bars, 10 μm. Images are representative of at least three captures for each
cell line. d, Confocal image of fixed HeLa cells stably expressing HA-CHMP4B.
Anti-HA in red and DNA (Hoechst) labelled in blue. Scale bar, 5 μm. Image is
representative of at least ten captures. e, Confocal images of late anaphase HeLa
cells stained for endogenous proteins (CHMP4A, CHMP3, CHMP2? in red) or
stably expressing tagged proteins (CHMP1B-Flag, CHMP1A-V5, IST1-
mCherry in red) as indicated and DNA (Hoechst) shown in blue. Scale bars, 5
μm. Images are representative of at least three captures. f, Confocal image of
HeLa cells labelled for CHMP4B-eGFP, the early endosome marker EEA1, the
multivesicular endosome marker Hrs and DNA (Hoechst). Image is
representative of five captures. g, As f, but stained for the late endosome marker
LAMP1 instead of Hrs. Image is representative of five captures. h, Confocal
images of fixed HeLa cells in different phases of mitotic exit, stained for lamin
A, endogenous CHMP4B, and DNA (Hoechst). Scale bars, 5 μm. Images are
representative of at least five captures for each stage. i, Confocal image of HeLa
cells at different phases of mitotic exit stained as indicated. Note the different
stage in nuclear pore reassembly in late anaphase (arrow) compared with
telophase (arrowhead). Scale bar, 5 μm. Image is representative of at least ten
captures. j, SIM reconstruction of HeLa cells (labelled as indicated) shows no
co-localization between CHMP4B and nuclear pores (Nup153). Image is
representative of five captures.
Extended Data Figure 2 | ESCRT-III is transiently recruited around
chromatin discs during nuclear envelope reformation. a, Deconvolved wide-
field live-cell imaging of HeLa cells stably expressing CHMP4B-eGFP and
mCherry-KDEL, showing recruitment of CHMP4B only at sites where the
reforming NE has engulfed the chromatin disc (see gallery). Images are
representative of at least ten videos. b, HeLa cells stably expressing CHMP4B-
eGFP and mCherry-α-tubulin were fixed and treated with increasing
concentrations of DNase I. Cells were then labelled for GFP, mCherry and
DNA and imaged with a confocal microscope. Scale bar, 5 μm. Images are
representative of at least three captures for each condition. c, Immunoblot
showing endogenous CHMP4B knockdown in HeLa cells stably expressing
WT CHMP4B-eGFP or a membrane binding defective mutant CHMP4B-
eGFP 4DE. Asterisk indicates non-specific immunoreactivity. d, Table showing
frequency of cells where CHMP4B was recruited around anaphase chromatin
under the indicated siRNA transfections. CHMP4B is not recruited upon
CHMP7 depletion. This result is representative of knockdown experiments
using three independent CHMP7-targeting siRNAs. e, Representative confocal
images of HeLa cells transfected with the indicated siRNAs, fixed and
immunolabelled for CHMP4B, α-tubulin and DNA. CHMP7 depletion affects
CHMP4B recruitment around anaphase chromatin (upper panel), but does not
affect CHMP4B recruitment at the midbody (lower panel). Scale bars, 5 μm.
Images are representative of 20 captures each for control and CHMP7 siRNAs. 
f, Immunoblot showing efficient endogenous CHMP7 knockdown in HeLa 
cells stably expressing CHMP4B-eGFP. 
Extended Data Figure 3 | ESCRT-III and VPS4 cooperate at foci of NE-MT
intersection during nuclear envelope reassembly. a, Deconvolved wide-field
images of live HeLa cells stably expressing HA-eGFP-VPS4A followed through
mitotic exit, showing transient localization around chromatin discs during late
anaphase. Scale bar, 5 μm. Gallery is representative of three videos.
b, CHMP4B-eGFP/H2B-mCherry HeLa cells stably expressing untagged WT
CHMP2A or untagged RES CHMP2A (CHMP2A allele resistant to CHMP2A
siRNA 1) were transfected with the indicated siRNAs. Cells were then imaged
live after anaphase onset and residence time (in minutes) of CHMP4B foci
around chromatin was scored and plotted as a percentage of population.
Immunoblot shows efficient depletion of CHMP2A in the non-resistant lines.
n = 21 cells for WT/control siRNA; n = 24 cells for WT/CHMP2A siRNA 1;
n = 11 cells for WT/CHMP2A siRNA 2; n = 18 cells for RES/control siRNA; n =
17 cells for RES/CHMP2A siRNA 1; n = 12 cells for RES/CHMP2A siRNA 2.
***P = 0.0002 was derived from an unpaired t-test. c, HeLa cells stably
expressing CHMP4B-eGFP and Nup58-mCherry were transfected with the
indicated siRNAs for 48 h and imaged every 20 s after anaphase onset.
Normalized fluorescence intensity of Nup58-mCherry at NE was plotted over
time. n = 8 DNA discs for control siRNA; n = 8 DNA discs for CHMP3 siRNA.
Bars, mean and 95% confidence intervals. d, Immunoblot showing efficient
endogenous CHMP3 knockdown in HeLa cells stably expressing CHMP4B-
eGFP and Nup58-mCherry. e, HeLa cells stably expressing IBB-eGFP and
H2B-RFP were transfected with the indicated siRNAs. Cells were then imaged 
every 20 s after anaphase onset. The mean intensity of the nuclear IBB was 
normalized to the total cellular IBB signal (nucleus + cytoplasm) and plotted 
over time. n = 37 cells for control siRNA; n = 32 cells for CHMP2A siRNA. 
Bars, mean and 95% confidence intervals. f, Electron tomography and three- 
dimensional reconstruction of kinetochore MTs (light blue) intersecting the 
reforming NE (magenta), obtained from correlative live-cell imaging (inset) 
and transmission EM. Nuc, nucleus. The electron tomography image is 
representative of EM analysis of three anaphase cells. 
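
The normalization in panel e (nuclear IBB signal divided by the total of nucleus plus cytoplasm) and the "mean and 95% confidence intervals" summary can be sketched as below. This is a hedged illustration: the function names and values are hypothetical, and the interval uses a normal approximation as a simplifying assumption. A t-based interval, as statistics packages commonly report, would be wider for small n.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def nuclear_fraction(nuclear, cytoplasmic):
    """Nuclear signal normalized to the total (nucleus + cytoplasm)."""
    return nuclear / (nuclear + cytoplasmic)


def mean_with_ci(values, level=0.95):
    """Mean with a normal-approximation confidence interval (an assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return centre, centre - half_width, centre + half_width


# Hypothetical (nuclear, cytoplasmic) intensities for three cells at one time point.
cells = [(20.0, 80.0), (30.0, 70.0), (40.0, 60.0)]
fractions = [nuclear_fraction(n, c) for n, c in cells]
centre, lower, upper = mean_with_ci(fractions)
print(f"mean {centre:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Dividing by the total cellular signal makes cells with different expression levels comparable, which is why the fraction rather than the raw nuclear intensity is averaged.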
Extended Data Figure 4 | ESCRT-III and VPS4 cooperate at foci of the NE-
MT intersection during nuclear envelope reassembly. CLEM of a whole
nucleus from HeLa cells expressing CHMP4B-eGFP and mCherry-CENP-A
by combination of a confocal stack and a complete serial EM reconstruction of
the same structure. CLEM analysis is representative of three chromatin discs.
a, Confocal stack; z;-Z9. Pink (z,,) and orange (z13) highlighted confocal
planes correspond to sections shown in Fig. 2c and in Extended Data Fig. 4f,
respectively. b, The strongest CHMP4B-eGFP signals were annotated in an
Imaris stack as centres of mass (green spheres in b; blue, DNA; pink and
orange bars indicate sections shown in Fig. 2c) and then transferred manually
onto a three-dimensional reconstruction from serial EM sections (n = 13
CHMP4B-eGFP mCherry-CENP-A double-positive foci). c, Complete EM
serial reconstruction. Bar index indicates corresponding EM sections.
d, Immunofluorescence image corresponding to slice 11 through the three-
dimensional model in b. e, Electron micrograph corresponding to slice 42
through the three-dimensional model in c (nucleus, blue outline; green circles,
approximate positions of centres of mass from b). f, A confocal plane (z13)
with the respective electron micrographs is shown, including insets with
increasing magnifications. CHMP4B and CENP-A foci (detected by light 
microscopy) correlate with unsealed NE (observed on EM sections). 
Extended Data Figure 5 | ESCRT-III and VPS4 cooperate at foci of the NE-
MT intersection during nuclear envelope reassembly. a, Section from a SIM 
z-stack of a formaldehyde-fixed HeLa cell labelled as indicated. The Imaris 
surface three-dimensional renderings (three examples) illustrate holes in the 
NE with intersecting spindle MTs and CHMP4B localization within the NE 
holes. The SIM image is representative of at least ten captures. Imaris 
reconstruction was performed on four areas of intersection between CHMP4B, 
MTs and ER. b, Deconvolved wide-field images of fixed CHMP4B-eGFP HeLa 
cells labelled as indicated. Note CHMP4B foci localization in connection with 
polar MTs (left, middle and upper right panels) and to kinetochore MTs (lower 
right panel). Images are representative of at least 20 videos. c, Deconvolved
wide-field image from live HeLa cells stably expressing mCherry-CENP-A,
CHMP4B-eGFP and LSSmKate1-α-tubulin, showing localization of CHMP4B
on kinetochore MTs and in close proximity to kinetochores. Image is 
representative of five videos. d, HeLa cells stably expressing mCherry-CENP-A 
and CHMP4B-eGFP were imaged after anaphase onset. Number of CHMP4B 
foci not co-localizing or co-localizing with CENP-A, and total number of 
CENP-A foci, were quantified and plotted over time. Time = 0 equals onset of 
CHMP4B recruitment. n = 6 cells. Bars, mean and s.d. 
Extended Data Figure 6 | ESCRT-III and VPS4 cooperate at foci of the NE-
MT intersection during nuclear envelope reassembly. a, HeLa cells stably
expressing CHMP4B-eGFP and mCherry-α-tubulin were transfected with the
indicated siRNAs and imaged every 30 s during mitotic exit. Microtubule
bundles contacting CHMP4B foci around chromatin discs were tracked and
shown in representative galleries. Time = 0 indicates CHMP4B enrichment on
the MT bundle. Scale bars, 5 μm. Images and galleries are representative of 23
events for control siRNA; 26 events for CHMP3 siRNA; 18 events for CHMP2A
siRNA; 22 events for VPS4A+B siRNA. b, CHMP4B-eGFP/mCherry-α-
tubulin HeLa cells stably expressing WT or siRNA 1 resistant CHMP2A were
transfected with CHMP2A siRNA 1 and imaged every 30 s during mitotic exit.
Spindle MTs reaching CHMP4B were tracked and their time of disappearance
after CHMP4B onset was scored and plotted. n = 27 MTs for WT CHMP2A;
n = 25 MTs for RES CHMP2A. Tukey whiskers extend from the largest value
within 1.5 IQR (interquartile range) of the upper quartile to the smallest value
within 1.5 IQR of the lower quartile. ***P < 0.0001 derived from unpaired
t-test. c, Deconvolved wide-field images of CHMP4B-eGFP/mCherry-α-
tubulin HeLa cells where MT intensity around CHMP4B foci was measured
using line plots. Measured values were plotted and Spearman’s correlation
analysed. Scale bars, 5 μm.
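
The Tukey whisker rule stated in this legend can be sketched in Python as follows. This is an illustrative implementation with hypothetical names and made-up values. Quartile conventions differ between software packages, so the `inclusive` method of the standard-library `statistics.quantiles` function is an assumption and may not match the plotting software used for the figure.

```python
from statistics import quantiles


def tukey_whiskers(values):
    """Return (lower, upper) whisker ends under the 1.5 x IQR rule.

    The upper whisker is the largest value no more than 1.5 IQR above the
    upper quartile; the lower whisker is the smallest value no more than
    1.5 IQR below the lower quartile. Points outside are drawn as outliers.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return min(inside), max(inside)


# Hypothetical MT disappearance times in minutes; 30 lies beyond the upper fence.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
print(tukey_whiskers(times))
```

With this rule the whiskers always end on real data points, so an extreme measurement shows up as an outlier instead of stretching the whisker.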
Extended Data Figure 7 | ESCRT-III-dependent recruitment of spastin to
the reforming NE mediates mitotic spindle disassembly. a, Deconvolved
wide-field image of live HeLa cells stably co-expressing mCherry-spastin M1
with CHMP4B-eGFP, illustrating co-localization. Image is representative of at
least five videos. b, As in a, but instead co-expressing the mCherry-spastin M87
allele. Image is representative of at least five videos. c, A section from a SIM
z-stack of a formaldehyde fixed HeLa cell labelled as indicated. The Imaris
surface three-dimensional renderings (right) illustrate how CHMP4B and
spastin embrace the spindle MT. SIM image is representative of five captures.
d, ESCRT-III-dependent localization of spastin shown by confocal imaging of
endogenous spastin (red), endogenous CHMP4B (green) and DNA (blue) after
siRNA transfections as indicated. Scale bars, 5 μm. Images are representative of
30 cells for control siRNA and 30 cells for CHMP4B siRNA. e, Quantification of
endogenous spastin fluorescence intensity around anaphase chromatin in cells
treated and immunolabelled as in d. n = 30 cells for control siRNA; n = 30 cells
for CHMP4B siRNA. Bars, mean and s.d. ***P < 0.0001 derived from
unpaired t-test. f, Immunoblots show efficient knockdown of endogenous
CHMP1B (upper panel) and endogenous IST1 (lower panel). g, Deconvolved
wide-field images of live anaphase HeLa cells stably co-expressing mCherry-
spastin M87 with CHMP4B-eGFP, transfected with the indicated siRNA. Scale
bars, 5 μm. Images are representative of 46 cells for control siRNA; 14 cells for
CHMP2A siRNA; 11 cells for CHMP1B siRNA; 31 cells for IST1 siRNA.
h, Confocal images of HeLa cells stably co-expressing CHMP1B-Flag with
CHMP4B-eGFP, transfected with control or CHMP2A siRNA, fixed and
stained for DNA, CHMP4B-eGFP and IST1 (left panel) or CHMP1B-Flag
(CHMP1B-FL) (right panel). Scale bars, 5 μm. Images are representative of at
least five captures for each condition.
Extended Data Figure 8 | ESCRT-III-dependent recruitment of spastin to
the reforming NE mediates mitotic spindle disassembly. a, Immunoblot of 
whole-cell lysates showing the efficiency of endogenous spastin depletion by the 
indicated siRNAs. b, Residence time of CHMP4B localization at anaphase 
chromatin discs is increased in HeLa cells expressing inducible spastin 
M87"° allele in addition to CHMP4B-eGFP. n = 17 cells for control (non-
induced); n = 19 cells for induced spastin M8772. Bars, mean with 95%
confidence intervals. ***P < 0.0001 derived from unpaired t-test. c, HeLa cells 
stably expressing CHMP4B-eGFP were imaged every 20 s after anaphase onset. 
Taxol or DMSO was added in early anaphase. The percentage of cells with 
CHMP4B localized at chromatin discs was plotted over time. n = 9 cells for 
DMSO; n = 12 cells for Taxol. P < 0.0001 derived from unpaired t-test. 
Extended Data Figure 9 | ESCRT-III dysfunction compromises nuclear
integrity and leads to DNA damage and cell cycle arrest. a, hTERT RPE1 cells
were transfected with the indicated siRNAs. Cells were then fixed and labelled
for γ-H2AX and DNA. Images were collected on a high-content ScanR
microscope and the mean fluorescence intensity of γ-H2AX foci was measured
and the percentage of the population plotted. n = 3,414 cells for control siRNA;
n = 2,320 for CHMP2A siRNA; n = 3,410 for VPS4A+B siRNA. Tukey
whiskers extend from the largest value within 1.5 IQR of the upper quartile to
the smallest value within 1.5 IQR of the lower quartile. ***P < 0.0001 derived
from unpaired t-test. b, Confocal image of HeLa cells depleted for CHMP2A
fixed and labelled for nuclear pore complex (anti-pan NPC) and 53BP1. Scale
bar, 5 μm. Image is representative of six captures. c, Immunoblot showing
efficient knockdown of CHMP2A, VPS4A, VPS4B and CEP55 in hTERT RPE1
cells. Asterisk indicates non-specific immunoreactivity. d, Confocal images of
hTERT RPE1 cells that were fixed 1 h or 5 h after exposure to ionizing radiation.
Cells are labelled for DNA, γ-H2AX and CHMP4B. Scale bar, 5 μm. Images are
representative of five captures for control; seven captures for 2 Gy, 1 h; eight
captures for 2 Gy, 5 h. e, Model for coordination of spindle disassembly and
nuclear envelope sealing by ESCRT-III, VPS4 and spastin. During anaphase,
ESCRT-III is recruited by CHMP7 to sites where the reforming NE engulfs
spindle MTs. ESCRT-III recruits spastin M1 that is embedded in the engulfing
NE as well as cytosolic spastin M87, which together promote MT disassembly.
As such, ESCRT-III, VPS4 and spastin cooperate to coordinate progressive
membrane constriction while severing MTs. Only when the surface of the
reforming nucleus is cleared of spindle MTs can NE sealing occur.
ESCRT-III controls nuclear envelope reformation


Yolanda Olmos1, Lorna Hodgson2, Judith Mantell2,3, Paul Verkade2,3,4 & Jeremy G. Carlton1


During telophase, the nuclear envelope (NE) reforms around daugh- 
ter nuclei to ensure proper segregation of nuclear and cytoplasmic 
contents. NE reformation requires the coating of chromatin by
membrane derived from the endoplasmic reticulum, and a sub-
sequent annular fusion step to ensure that the formed envelope is
sealed. How annular fusion is accomplished is unknown, but it
is thought to involve the p97 AAA-ATPase complex and bears a
topological equivalence to the membrane fusion event that occurs
during the abscission phase of cytokinesis. Here we show that the
endosomal sorting complex required for transport-III (ESCRT-III) 
machinery localizes to sites of annular fusion in the forming NE in 
human cells, and is necessary for proper post-mitotic nucleo- 
cytoplasmic compartmentalization. The ESCRT-III component 
charged multivesicular body protein 2A (CHMP2A) is directed to 
the forming NE through binding to CHMP4B, and provides an 
activity essential for NE reformation. Localization also requires the 
p97 complex member ubiquitin fusion and degradation 1 (UFD1). 
Our results describe a novel role for the ESCRT machinery in cell 
division and demonstrate a conservation of the machineries involved 
in topologically equivalent mitotic membrane remodelling events. 
The ESCRT-III complex performs a topologically unique membrane 
fusion, allowing the release of enveloped retroviruses during viral bud- 
ding, intraluminal vesicles during multivesicular body biogenesis, and 
daughter cells during the abscission phase of cytokinesis. We found
that as well as localizing to the midbody during late cytokinesis, endo- 
genous ESCRT-III components CHMP2A and CHMP2B encircled the 
forming daughter nuclei during telophase (Fig. 1a, b and Extended Data 
Fig. 1a). CHMP2A localization was sensitive to CHMP2A-targeting 
short interfering RNA (siRNA; Extended Data Fig. 1b) and was not 
continuous; instead we found that CHMP2A adopted a transient 
punctate localization around the decondensing nuclei during telophase 
(Extended Data Fig. 1c and Supplementary Video 1). By scoring local- 
ization in HeLa cells stably expressing mCherry-tubulin (cell cycle of 
21.5 ± 1.7 h (mean ± s.d.), n = 93), we estimate the duration of
CHMP2A localization to be 96 ± 8.9 s. We found that cells expressing
green fluorescent protein (GFP)-tagged CHMP4B (ref. 12) also dis-
played a transient, punctate, juxta-nuclear localization during telo-
phase with recruitment of GFP-CHMP4B lasting 225 ± 66 s (n = 8)
and individual puncta lasting 75 ± 46 s (n = 92; Extended Data Fig. 1d
and Supplementary Video 2). Telophase ESCRT-III localization was
observed in other cell lines, including human diploid fibroblasts
(Extended Data Fig. 1e). Using HeLa cells stably expressing a yellow
fluorescent protein (YFP)-tagged nuclear envelope marker (lamin
associated protein 2β, YFP-LAP2β), we determined that the juxta-
nuclear localization corresponded to the forming nuclear envelope.
Here, we observed colocalization with the lamin B receptor (LBR)
(Fig. 1c) and demonstrated that CHMP2A localization occurred before
appreciable formation of a nuclear lamina or nuclear pore complexes
(Extended Data Fig. 1f, g). While mitotic chromatin association of
ESCRT-III has been previously reported, its function remains
unknown. To investigate the role of ESCRT components at the NE,
we used siRNA to deplete these proteins. As described previously,
depletion of ESCRT components produced aberrant nuclei, and these
defects phenocopied those produced by depletion of proteins required
for NE reformation (Extended Data Fig. 1h). NE reformation is
thought to be a two-phase process, separable into membrane fusion
events that create an expanding reticular network with subsequent
annular fusion of holes within this network to create a sealed barrier.
We next used correlative light-electron microscopy (Extended Data 
Fig. 2a-d) to examine telophase ESCRT-III NE localization. We 
found that at the stage of ESCRT-III recruitment, the NE had incom- 
pletely formed (Fig. 1d). Two populations of CHMP2A-positive mem- 
branes were found. First, isolated CHMP2A-decorated vesicles were 
observed in the cytoplasm, proximal to the forming NE (5.7 ± 4.2%
of total cellular gold, Extended Data Fig. 2e, i). Second, CHMP2A-
decorated double-membrane sheets were observed to coat the chro-
matin (51 ± 1.7% of total cellular gold was within 100 nm of the NE).
On these sheets, CHMP2A localized to discrete regions, with intact NE
being devoid of label, but with CHMP2A preferentially (Extended Data
Fig. 2h) decorating nucleo-cytoplasmic channels (mean diameter
38.4 ± 12.5 nm (± s.e.m.), n = 2 from 17 determinations) between
the forming double membranes of the NE (Fig. 1d, Extended Data
Figs 2d-g and 3a-d and Supplementary Videos 3 and 4). These channels
must be resolved through annular fusion, and given the observed local-
ization and topological equivalence with cytokinetic abscission (Fig. 1e),
we speculated that ESCRT-III might be involved in this process. 

Requirements for CHMP2A localization to the telophase NE were 
revealed through depletion of partner ESCRT proteins, with CHMP4B 
and CHMP3, as for other ESCRT-dependent membrane remodelling 
events, having a major role in this recruitment (Fig. 2a). We used 
siRNA-resistant Flag-tagged CHMP2A expressed at near-endogenous 
levels to report localization in the presence of CHMP2A siRNA 
(Fig. 2b, c). Through introduction of mutations targeting known bind- 
ing partners, we found, as for midbody recruitment and cytokinetic 
abscission (Extended Data Fig. 4a, b), and consistent with the prev- 
iously determined telophase localization of GFP-CHMP4B (Extended 
Data Fig. 1d), that while CHMP2A*-Flag localized to the forming 
NE, disrupting interaction with CHMP4 proteins by mutation of 
Arg24, Arg27 and Arg31 to Ala (CHMP2A*-Flag(RRR/AAA)) abol-
ished this localization. Mutation of the amino-terminal CHMP2A α0
helix, or residues involved in the interaction with VPS4 (ref. 16)
had no effect on NE localization (Fig. 2c and Extended Data Fig. 4a). 
These data indicate that CHMP2A is recruited to the forming NE 
through classical assembly of the ESCRT-III complex. 

The p97 AAA-ATPase controls both phases of NE reformation; 
together with its adaptor protein p47, it regulates membrane delivery 
and NE expansion, whereas through its adaptors nuclear protein local- 
ization 4 (NPL4) and UFD1 it regulates annular fusion. Through
NPL4 and UFD1, the p97 complex extracts ubiquitinated aurora-B,
a chromosomal passenger complex component, from chromatin to
allow chromatin decondensation and membranation. Given our
observed ESCRT-III localization (Fig. 1) and known interactions of
ESCRT-III components with the chromosomal passenger complex,
we screened the ESCRT machinery for interaction with the p97 


1Division of Cancer Studies, Section of Cell Biology and Imaging, King’s College London, London SE1 1UL, UK. 2School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk,
Bristol BS8 1TD, UK. 3Wolfson Bioimaging Facility, University of Bristol, Medical Sciences Building, University Walk, Bristol BS8 1TD, UK. 4School of Physiology & Pharmacology, University of Bristol, Medical
Sciences Building, University Walk, Bristol BS8 1TD, UK.
Figure 1 | ESCRT-III localizes to the forming nuclear envelope. a, HeLa cells
stained with anti-tubulin, either anti-CHMP2A or anti-CHMP2B, and 4’,6-
diamidino-2-phenylindole (DAPI). Scale bars, 10 μm. Images representative of
three acquired images in each case. b, Quantification of juxta-nuclear
CHMP2A localization during mitosis from a, quantification from 20 cells in
interphase, prophase, pro-metaphase and metaphase, 23 cells in anaphase, 24
cells in telophase, 36 cells in early cytokinesis and 20 cells in late cytokinesis.
c, HeLa cells stained with DAPI, anti-CHMP2A or anti-CHMP2B, and either
stably expressing YFP-LAP2β or stained with anti-LBR. Arrows indicate
regions of colocalization. Scale bars, 10 μm. Images representative of two (anti-
CHMP2B and YFP-LAP2β, anti-LBR and anti-CHMP2A) or four (anti-
CHMP2A and YFP-LAP2β) acquired images. d, Tomographic slices of HeLa
cells stained with fluoronanogold anti-CHMP2A. Correlation depicted in
Extended Data Fig. 2a-c; arrow indicates nucleo-cytoplasmic channel. Scale
bars, 200 nm. Images representative of 25 gold-decorated nucleo-cytoplasmic 
channels and quantified in Extended Data Fig. 2h. e, Schematic depicting 
topological equivalence between annular fusion of the NE and ESCRT- 
dependent membrane fusion events. MVB, multivesicular body. 


complex by yeast two-hybrid assay (Extended Data Fig. 5a-d). We
found that CHMP2A bound specifically to UFD1 and confirmed this
interaction by direct binding and co-precipitation assays (Fig. 3a, b
and Extended Data Fig. 5e-h). We mapped the interaction with
CHMP2A to the carboxy terminus of UFD1 (Extended Data Fig. 5f,
g) and found that truncation of the C terminus of CHMP2A, or
removal of the autoinhibitory helix (α5), prevented interaction with
UFD1 (Fig. 3a). We used siRNA targeting UFD1 (also known as
UFD1L) (Extended Data Fig. 6a); although its partner protein, p97,

Figure 2 | Classical ESCRT interactions govern CHMP2A telophase NE
localization. a, Immunofluorescence and quantification of NE localization in
HeLa cells transfected with the indicated siRNA and stained with anti-
CHMP2A, anti-tubulin and DAPI. Number of cells scored from 4 independent
experiments: control, 58; CHMP4A, 59; CHMP4B, 55; CHMP4C, 53; CHMP3,
64; CHMP1A, 47; CHMP1B, 52; CHMP2B, 44. Data are mean ± s.d. *P = 0.014,
**P = 0.0004 (two-tailed Student’s t-test). Scale bars, 10 μm. Images
representative of 47 (control), 26 (CHMP4B siRNA) and 42 (CHMP3 siRNA)
cells. b, Western blotting of lysates from siRNA-depleted HeLa cells stably
expressing mCherry-tubulin and the indicated CHMP2Aᴿ-Flag with anti-
Flag, anti-CHMP2A or anti-GAPDH antisera. CT, control; kDa, kilodaltons.
c, Immunofluorescence of CHMP2Aᴿ-Flag recruitment to the telophase NE in
CHMP2A-depleted cells. Scale bars, 10 μm. Quantification in Extended Data
Fig. 4a; images representative of 30 cells (control), 26 cells (L4D/F5D), 30 cells
(RRR/AAA) and 26 cells (L216D/L219D).


was required for EGFR degradation, we found cells depleted for
UFD1 degraded EGFR normally (Extended Data Fig. 6b), allowed
release of HIV-1-based lentivirus (Extended Data Fig. 6c), and com-
pleted cytokinesis normally as previously reported (Extended Data
Fig. 6d). However, while cells depleted for UFD1 recruited CHMP2A 
to the midbody (Fig. 3d), recruitment of CHMP2A to the forming NE 
was impaired (Fig. 3c, d). 

To examine mitotic roles for ESCRT-III in NE reformation, we
imaged synchronized cultures of cells stably expressing both his-
tone-2B-mCherry (H2B-mCh) and YFP-LAP2β, and quantified the
time taken to enclose the chromatin with YFP-LAP2β-positive NE.
We were surprised to find that cells lacking ESCRT-III, but not UFD1,
enclosed their chromatin faster than control cells (Extended Data Fig.
7a-c). To explore the integrity of the nascent NE in CHMP2A-
depleted cells, we followed a protocol similar to that recently
described and imaged synchronized cultures of HeLa cells stably
expressing both H2B-mCh and GFP-tagged β-galactosidase (βGal)
fused to the nuclear localization signal (NLS) from Simian virus 40
(GFP-NLS-βGal). GFP-NLS-βGal is released from the nucleus
after NE breakdown at mitotic onset, and returned after formation
of transport-competent nuclear pores during NE reformation
(Extended Data Fig. 8a, b). We found that the rate of GFP-NLS-
βGal return to the nucleus was slower in ESCRT-III-depleted cells

Figure 3 | UFD1 directs NE-localization of CHMP2A. a, β-galactosidase
activity of yeast co-transformed with the indicated Gal4- and VP16-fused
proteins (n = 3). b, Microscale thermophoresis experiments displaying
interaction of histidine-tagged UFD1 with CHMP2A (fraction unbound
displayed, n = 5). c, d, Immunofluorescence (d) and quantification (c) of NE


(Fig. 4a-c), despite the cells having enclosed their chromatin with NE
membranes faster (Extended Data Fig. 7a). While nuclei were fre-
quently malformed in ESCRT-III-depleted cells (Extended Data
Fig. 1h), incorporation of nuclear pore complexes and import

Figure 4 | ESCRT-III depletion disrupts nuclear envelope integrity.
a, Timelapse analysis of NE sealing in siRNA-transfected HeLa cells stably
expressing H2B-mCh and GFP-NLS-βGal. GFP signal presented according to
pseudocolour scale at the indicated time points. Scale bars, 10 μm. A single
image was pseudocoloured for demonstrative purposes. b, Quantification of 
NE sealing from siRNA-treated cells in a (cells were quantified at each time 
point; control, 140 cells from 7 independent experiments; CHMP2A-1, 98 cells 
from 5 independent experiments, P = 0.047; CHMP2A-2, 80 cells from 4 
independent experiments, P = 0.023; CHMP2A + CHMP2B, 60 cells from 3 
independent experiments, P = 0.006; CHMP3, 34 cells from 3 independent 
experiments, P = 0.002. All values are mean ± s.e.m.; two-tailed Student’s t-test
used to assess significance after 85 min). c, Western blotting of cell lysates from 
b with anti-CHMP2A, anti-CHMP2B, anti-CHMP3 or anti-GAPDH antisera. 
d, Z-slices extracted from a correlative tomographic reconstruction of the NE at 
60 min after anaphase onset from the indicated siRNA-transfected
mCherry-tubulin HeLa cells. The numbered circles correspond to discontinuities labelled
in the 3D reconstructions in Extended Data Fig. 10a. Scale bars, 200 nm. Images
representative of 6 (control) and 12 (CHMP2A-1 siRNA) tomographic

machineries were normal (Extended Data Fig. 8c-e). However, in
CHMP2A-, CHMP3- or UFD1-depleted cells, the post-mitotic
nucleo-cytoplasmic partitioning of GFP-NLS-βGal was reduced
(Fig. 4b, c and Extended Data Fig. 9a, b), indicating that NE integrity

reconstructions. e, The percentage of discontinuities smaller than 65 nm was
scored. Discontinuities in this range that were not nuclear pore complexes
(NPCs) as a percentage of total discontinuities (including NPCs) for n number
of reconstructed tomograms: control, 9.4 ± 3.0, n = 6; CHMP2A-1, 29.9 ± 4.7,
P = 0.01, n = 12; CHMP2A-2, 28.3 ± 2.0, P = 0.021, n = 2. The increase in the
percentage of non-NPC discontinuities was assessed by two-tailed Student’s
t-test (average diameter of non-NPC discontinuities was 38 ± 22 nm
(CHMP2A-1) and 58 ± 19 nm (CHMP2A-2)). f, Western blotting of lysates
from siRNA-treated HeLa cells stably expressing H2B-mCh, GFP-NLS-βGal
and siRNA-resistant CHMP2Aᴿ-Flag with anti-CHMP2A, anti-Flag or anti-
GAPDH antisera. g, Quantification of NE sealing from cells treated with
siRNA as in f and imaged from 4 independent experiments (mean nucleo-
cytoplasmic ratio given 85 min after anaphase onset ± s.d.; two-tailed Student’s
t-test was used to assess significance across 4 independent experiments (*);
control, 8.9 ± 3.1, n = 174; CHMP2A siRNA, 5.4 ± 2.6, n = 171, P = 0.0006;
CHMP2A siRNA + CHMP2Aᴿ-Flag, 8.4 ± 3.3, n = 132, not significant;
CHMP2A siRNA + CHMP2Aᴿ(RRR/AAA)-Flag, 5.4 ± 2.2, n = 196,
P = 0.0001).

was compromised by treatments that prevent ESCRT-III assembly at
the NE. Results were confirmed with a second reporter (GFP-NLS) 
(Extended Data Fig. 9c) and we demonstrated that nuclear retention 
of this probe was defective in post-mitotic ESCRT-III-depleted cells 
(Extended Data Fig. 9d, e). Using correlative live-cell electron tomo- 
graphy, we found that CHMP2A depletion resulted in the persistence 
of unsealed holes in the post-mitotic NE (Fig. 4d, e and Extended Data 
Fig. 10a, b). Paralleling CHMP2 requirements in lentiviral release and 
cytokinetic abscission (Extended Data Figs 6c and 9f), depletion of 
CHMP2B had minimal effect on NE integrity (Extended Data 
Fig. 9a, b), while co-depletion of CHMP2A and CHMP2B disrupted 
NE integrity to a greater extent than CHMP2A depletion alone 
(Fig. 4b). NE integrity could be rescued by stable expression of
siRNA-resistant CHMP2A-Flag (CHMP2Aᴿ-Flag), but, as with
CHMP2A requirements in cytokinesis (Extended Data Fig. 4b) and
HIV-1 release, not by expression of CHMP2Aᴿ-Flag(RRR/AAA)
(Fig. 4f, g). We describe a novel localization and function of ESCRT-
III in NE remodelling at sites of annular fusion, a process markedly
similar to classical ESCRT-III-mediated membrane remodelling 
(Extended Data Fig. 10c). Localization is governed by classical 
ESCRT-III assembly mechanisms and also requires UFD1. An equi- 
valent ESCRT-III-dependent membrane remodelling at the NE may 
allow viruses or megaRNPs to traverse this membrane, and in
yeast, ESCRT-III has recently been shown to participate in surveillance
and extraction of defective nucleoporins at the inner nuclear mem-
brane, indicating additional ESCRT-III activities on this membrane
may exist throughout the cell cycle. ESCRT-III is thus involved in 
regulating the quality of the NE, and gene expansion within the 
ESCRT machinery may have resulted from an evolutionary drive to 
accommodate open mitoses. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper.

METHODS

Cell culture. HeLa and 293T cells were gifts from J. Martin-Serrano and were
cultured in DMEM containing 10% FBS, penicillin (100 U ml⁻¹) and streptomy-
cin (0.1 mg ml⁻¹). GP2-293 cells were obtained from Clontech and were cultured
similarly. BJ fibroblasts were obtained from the ATCC and cultured in 4:1
DMEM:199 Media, supplemented with 15% FCS, penicillin (100 U ml⁻¹) and
streptomycin (0.1 mg ml⁻¹). Stable cell lines were generated by transduction
using MLV-based retroviruses as described previously, and selected using
puromycin (200 ng ml⁻¹), G418 (500 μg ml⁻¹) or hygromycin (200 μg ml⁻¹)
as necessary. Cells were sorted to monoclonality by limiting dilution or FACS.
HeLa cells stably expressing mCherry-tubulin or GFP-CHMP4B (ref. 12) have
been described previously and were gifts from J. Martin-Serrano. MycoSensor
(Agilent) was used to screen for contamination.

Plasmids. Plasmids encoding TSG101, EAP20 (also known as VPS25), EAP30 
(SNF8), EAP45 (VPS36), CHMP1A, CHMP1B, CHMP2A, CHMP2B, CHMP3, 
CHMP4A, CHMP4B, CHMP4C, CHMP5, VPS4A, LIP5 (VTA1), UBPY, CEP55, 
TAL, LAP2β (TMPO) and ALG2 were gifts from J. Martin-Serrano and have been
described previously. Coding sequences for p97 (also known as VCP), p47
(NSFL1C), NPL4 (NPLOC4), UFD1, CHMP7, VPS4B and SPARTIN (SPG20) were 
amplified from IMAGE clones (6502535, 3635947, 5017718, 3507963, 5551762, 
6042862 and 5313378), respectively, and were cloned into mammalian expression 
(pCR3.1-YFP) and yeast two-hybrid plasmids (pHB18 and pGBKT7). A plasmid 
encoding HD-PTP was a gift from P. Woodman and was cloned similarly. siRNA- 
resistant CHMP2A constructs were created by introducing Gln132Gln, 
Ala133Ala, Glu134Glu, Ile135Ile and Asp137Asp silent mutations in the 
CHMP2A coding sequence by PCR. 5’ EcoRI and 3’ NotI sites were added to 
facilitate cloning. Additional CHMP2A truncations, mutations and deletions were 
created by standard PCR procedures. UFD1 and its deletions were cloned with 5’ 
and 3’ NotI sites into relevant expression vectors. CHMP2A constructs were 
subcloned into pGBKT7 and UFD1 constructs were cloned into pHB18 for yeast 
two-hybrid analysis. For recombinant protein expression, CHMP2A was cloned 
into pGEX (GE Healthcare) and UFD1 constructs were cloned into pET28a 
(Novagen). For glutathione S-transferase (GST)-pulldown experiments, con-
structs were cloned into pCAGGS-GST-EcoRI-NotI-XhoI (a gift from J.
Martin-Serrano). A lentiviral expression vector (pLVXP) was a gift from M.
Dodding and modified to express a GFP-EcoRI-NotI-XhoI polylinker by replacing
its SnaBI/XbaI fragment with a GFP-EcoRI-XhoI-NotI-STOP-XbaI fragment,
obtained by PCR from pCR3.1-GFP-EcoRI-XhoI-NotI (a gift from J. Martin-
Serrano). SnaBI/XhoI fragments of pHM840 (encoding GFP-NLS-βGal and
obtained from T. Stamminger via Addgene) were cloned into the SnaBI/XhoI sites
of pLVXP-GFP-ENX to produce pLVXP-GFP-NLS-βGal. pLVXP-GFP-NLS was
obtained by amplifying GFP using primers to incorporate a C-terminal SV40 NLS. 

A SnaBI/NotI fragment from pH2B-mCherry-IRES-Neo3 (a gift from U.
Eggert) was subcloned into SnaBI/NotI sites of pLHCX-MCS (a modified version
of pLHCX containing a HindIII/MluI/SalI/XhoI/NotI/HpaI/BamHI/NsiI/ClaI
MCS; a gift from T. Ng) to create pLHCX-H2B-mCh. LAP2β residues 244-454
were amplified with in-frame NotI sites and cloned into the NotI site of
pMSCVneo-YFP-EXN to create pMSCVneo-YFP-LAP2β. siRNA-resistant
CHMP2A constructs were cloned with C-terminal Flag extensions into EcoRI 
and NotI sites of pCR3.1-EXN and subcloned into pNG72-ENX (gifts from 
J. Martin-Serrano). 

For retroviral transduction, the above constructs within pMSCVneo-YFP-EXN,
pNG72 or pLHCX-MCS retroviral packaging vectors were transfected with
pVSVG into GP2-293 cells (all from Clontech). Supernatants were collected,
clarified by centrifugation (200g, 5 min), filtered (0.45 μm) and used to infect
target cells in the presence of 8 μg ml⁻¹ polybrene (Millipore) at multiplicity of
infection (MOI) < 1. For lentiviral transduction, 293T cells were transfected with
pCMV8.91, pVSVG and with pLVXP-GFP-NLS-βGal or pLVXP-GFP-NLS.
Supernatants were collected, clarified by centrifugation (200g, 5 min), filtered
(0.45 μm) and used to infect target cells in the presence of 8 μg ml⁻¹ polybrene
(Millipore) at MOI < 1. In both cases, antibiotic selection was applied after 48 h.
Antibodies. Antibodies against HSP90 (H114) were from Santa Cruz 
Biotechnology, TSG101 (T5701) was from Sigma, GAPDH (MAB374) was from 
Millipore, tubulin (DM1A) was from Sigma, CHMP2A (104771-AP) was from 
Proteintech, CHMP2B (ab33174) was from Abcam, CHMP4B (sc82556) was from 
Santa Cruz, CHMP3 (sc67228) was from Santa Cruz, UFD1 (106151-AP) was 
from Proteintech, anti-p24 Gag (183-H12-5C) was from the NIH AIDS Research 
and Reference Reagent Program, EGFR (2232) was from Cell Signaling 
Technology, GFP (7.1/13.1) was from Roche, LBR (SAB10400151) was from 
Sigma, Lamin A/C (MAB3538) was from Millipore, mAb414 was from 
Covance, DYKDDDDK-Tag (Flag) was from Cell Signaling Technology. Alexa- 
conjugated secondary antibodies were from Invitrogen and horseradish peroxi- 
dase (HRP)-conjugated secondary antibodies were from Millipore. 


SDS-PAGE and western blotting. Cell lysates were denatured in Laemmli buffer 
and resolved using SDS-PAGE gels. Resolved proteins were transferred onto 
nitrocellulose by western blotting and were probed with the indicated antisera 
in 5% milk. HRP-conjugated secondary antibodies were incubated with ECL 
Prime enhanced chemiluminescent substrate (GE Healthcare) and visualized by 
exposure to autoradiography film. 

Transient transfection of cDNA. HeLa cells were transfected using 
Lipofectamine-2000 (Life Technologies) according to the manufacturer’s instruc- 
tions. GP2-293 and 293T cells were transfected using linear 25-kDa polyethyleni- 
mine (PEI, Polysciences, Inc.). 

siRNA transfections. Cells were seeded at a density of 1 x 10° cells per ml (HeLa, 
BJ) or 2.6 X10” cells per ml (293T) and were transfected with siRNA at 100 nM, 2h 
after plating using Dharmafect-1 (Dharmacon). To minimize toxicity associated 
with CHMP2A and UFD1 depletion, single transfections were performed for 72 h. 
The following targeting sequences that have already been demonstrated to achieve 
potent and specific suppression of the targeted CHMP were used: control, 
Dharmacon non-targeting control D-001810-01; CHMP2A-1: AGGCAGAGAUCAUGGAUAUdTdT;
CHMP2A-2: AAGAUGAAGAGGAGAGUGAdTdT; CHMP2B: UCGAGCAGCUUUAGAGAAAdTdT;
CHMP3: GGAAGAAGCAGAAAUGGAAdTdT; CHMP4A: Q-SI104268845 (ref. 13); CHMP4B:
Q-SI00325199 (ref. 13); CHMP4C: Q-SI04279674 (ref. 13); CHMP1A:
CCAAGAAGGCGGAGAAGGAdTdT; CHMP1B: UGGACAAAUUCGAGCACCAdTdT;
UFD1-1: GAGGCAGAUUCGUCGCUUUdTdT; UFD1-2: MQ-017918-03-0002;
UFD1-3: GUGGCCACCUACUCCAAAUdTdT; LEM4: GAGAAGACGCUGAGAAAUUdTdT.
UFD1-2 was excluded from much of the analysis owing to
toxicity and morphological changes specific to this oligonucleotide. 

Yeast two-hybrid assays. Yeast Y190 cells were co-transformed with plasmids 
encoding the indicated proteins fused to the VP16 activation domain (pHB18) or 
the Gal4 DNA-binding domain (pGBKT7). Co-transformants were selected on 
SD-Leu-Trp agar for 3 days at 30 °C, collected, and LacZ activity was measured 
using a liquid β-galactosidase assay using chlorophenol red-β-D-galactopyranoside
(Roche) as a substrate. Average β-galactosidase activities are presented.
Lentiviral release. 293T cells were transfected with siRNA as described above, 
except that the second transfection contained additionally 300 ng of HIV-1 
pCMV4d8.91 (a gift from T. Ng), 100 ng of pLenti-SEW (a packaging vector 
encoding GFP, a gift from A. Ridley) and 100 ng pVSVG. After 48 h, virions were 
collected from 293T supernatants by filtration (0.45 μm) and centrifugation
through 20% sucrose (21,000g, 120 min), lysed, resolved by SDS-PAGE and
examined by western blotting. Additionally, HeLa cells were infected with 50 μl
of viral supernatant and GFP-expression in these cells was measured by western
blotting. Virion release was calculated as the ratio of released (virion) to
cellular Gag, as determined by densitometry using ImageJ.

Co-precipitation assays. 293T cells were co-transfected with equal quantities of the
indicated pCAGGS-GST construct and the relevant pCR3.1-YFP construct for
48 h. Cells were collected and lysed in 1 ml of 50 mM Tris-HCl, pH 7.4, 150 mM
NaCl, 5 mM EDTA, 5% glycerol, 1% Triton X-100 and a protease inhibitor mixture
(complete mini-EDTA-free, Roche). Clarified lysates were incubated with glu-
tathione-Sepharose beads (Amersham Biosciences) for 3 h at 4 °C and washed three
times with wash buffer (50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 5 mM EDTA, 5% 
glycerol, 0.1% Triton X-100). Bead-bound proteins were recovered in Laemmli 
sample buffer, resolved by SDS-PAGE and examined by western blotting. 
Production of recombinant proteins. BL21(DE3) Escherichia coli expressing
plasmids encoding GST-tagged or His-tagged proteins were collected in bacterial
lysis buffer (20 mM Hepes, pH 7.4, 500 mM NaCl, supplemented with complete
mini, EDTA-free protease inhibitor (Roche) and 1 mM phenylmethylsulfonyl
fluoride (PMSF)); buffers for His-tagged protein purification were supplemented
with 20 mM imidazole. Cells were lysed by addition of lysozyme (1 mg ml⁻¹, 15
min) and Triton X-100 (0.25%, 15 min) and were snap-frozen in liquid nitrogen. Cells
were thawed on ice, clarified through addition of DNase I (20 μg ml⁻¹) and soluble
proteins were collected by centrifugation at 28,000g for 30 min. Proteins were
immobilised on Glutathione Sepharose 4B or Ni-NTA sepharose resin, washed in
wash buffer (20 mM Hepes, pH 7.4, 150 mM NaCl), containing 20 mM imidazole
if required. Proteins were eluted from Glutathione Sepharose 4B resin in wash
buffer supplemented with 10 mM reduced glutathione (pH 8), or were eluted from
Ni-NTA sepharose with a step gradient of imidazole. CHMP2A was cleaved from
the GST using AcTEV protease (Life Technologies). Protein concentrations were
measured using the Qubit assay system (Life Technologies).

Microscale thermophoresis. Measurements were performed using a Monolith
NT.115 instrument (Nanotemper). In brief, recombinant CHMP2A was labelled
with Alexa-647 using an NHS Amine-reactive labelling kit (Nanotemper); label-
ling was verified by infrared imaging (Licor). 360 nM Alexa-647 CHMP2A was
combined with serial dilutions of GST, His-UFD1 or His-UFD1(1-257) (max-
imally 52.8 μM). Interactions were performed in 150 mM NaCl, 20 mM Hepes,
0.04% Tween-20, pH 7.4. Temperature jump and thermophoresis experiments
were conducted using 100% LED illumination and 40% infrared laser power and
were analysed using Nanotemper’s analysis suite. Binding curves could only be
generated for the CHMP2A:UFD1 interaction; affinities were calculated by the
software and averaged.

Fixed cell imaging. Cells were imaged using Nikon Eclipse microscopes teamed 
with widefield (Ti-E) and confocal (A1R or Spinning Disc) imaging systems. 
Widefield image stacks were iteratively deconvolved using Autoquant. Images 
were processed in NIS Elements and exported to Photoshop for assembly. HeLa 
cells were fixed in methanol (for CHMP2A staining) or 4% paraformaldehyde
(PFA) and subjected to processing for immunofluorescence as described previously.
For multinucleation and midbody arrest assays, at least 300 cells per experiment 
were quantified. For telophase NE localization, between 10 and 20 telophase cells 
per experiment were scored. For fixed cell microscopical analysis, we scanned 
multiple coverslips and experiments before acquiring two to three representative 
images for presentation in figures. 

Live cell imaging. HeLa cells stably expressing the indicated proteins were plated 
in Stickyslides (Ibidi) adhered to a glass number 1 coverslip and transfected with 
the indicated siRNA. Cells were synchronised using a double thymidine block, and 
48 h after siRNA transfection (10.5 h after release from the second thymidine 
block), cells were transferred to a Nikon inverted spinning disc confocal micro-
scope with attached environmental chamber and imaged live for 4 h using a 20×
dry objective and a 1.5× magnification lens. For mitotic rim formation, three
coordinates per condition were selected and frames were acquired every 1 min;
rim formation was scored through manual analysis of individual frames. For
nuclear accumulation of GFP-NLS and GFP-NLS-βGal, frames were acquired
every 1 min. The ratio of background-corrected, area-normalized, GFP-positive
pixel intensities within the cytoplasm and mCh-H2B demarcated nuclei at the
indicated intervals was obtained using NIS-Elements. We excluded 2 out of 98
cells from CHMP2A-1 analysis, 2 out of 60 cells from UFD1-1 analysis and 2 out of
60 cells from UFD1-3 analysis as these gave anomalous N/C (nuclear/cytoplasmic)
ratios >10× s.d. from the mean. For imaging of GFP-CHMP4B recruitment to
the telophase NE, cells were imaged using a 100× oil-immersion objective and
confocal slices were acquired every 30 s using a spinning disc confocal microscope.
For analysis of nuclear retention, siRNA-treated HeLa cells stably expressing
GFP-NLS and mCh-H2B were imaged live using a Nikon A1R confocal micro-
scope. Between 1 and 2 h after anaphase onset, cells were subjected to photo-ablation
of cytosolic GFP-NLS signal by point bleaching and the recovery of cytoplasmic
fluorescence from the nuclear pool was quantified for 10 min after bleaching. 
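
The N/C ratio and the exclusion of anomalous cells described above can be illustrated with a minimal Python sketch. It is not the NIS-Elements workflow used in this study: the function names, the single per-pixel background value and the example numbers are assumptions made for illustration. The text does not state how the mean and s.d. used for exclusion were computed, so the sketch assumes each cell is compared against the mean and s.d. of the remaining cells.

```python
from statistics import mean, stdev


def nc_ratio(nuc_total, nuc_area, cyt_total, cyt_area, background):
    """Nuclear/cytoplasmic ratio of background-corrected, area-normalized intensity.

    Totals are summed GFP pixel intensities, areas are pixel counts and
    background is a per-pixel offset (all hypothetical inputs).
    """
    nuc_mean = nuc_total / nuc_area - background
    cyt_mean = cyt_total / cyt_area - background
    if cyt_mean <= 0:
        raise ValueError("cytoplasmic signal is not above background")
    return nuc_mean / cyt_mean


def exclude_anomalous(ratios, n_sd=10.0):
    """Keep ratios within n_sd standard deviations of the mean of the other cells.

    Assumption: leave-one-out mean and s.d.; at least three ratios are needed.
    """
    kept = []
    for i, r in enumerate(ratios):
        others = ratios[:i] + ratios[i + 1:]
        if abs(r - mean(others)) <= n_sd * stdev(others):
            kept.append(r)
    return kept


# Hypothetical example: five typical cells and one anomalous cell.
ratios = [8.0, 9.0, 10.0, 9.5, 8.5, 200.0]
filtered = exclude_anomalous(ratios)
```

A leave-one-out comparison is used here because a single extreme value inflates the s.d. of the whole sample; this is a design choice of the sketch, not a statement about the original analysis.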
Correlative light electron microscopy. Around 500,000 HeLa cells were seeded 
in a 3.5-cm Mattek gridded dish (P35G-2-14-C-GRID). The next morning, cells 
were fixed in phosphate buffer containing 1% PFA for 3 min. Cells were permea- 
bilized with 0.1% saponin in PHEM (60 mM PIPES, 25 mM HEPES, 10 mM 
EGTA, 2 mM MgCl₂, pH 6.9) and processed for immunofluorescence using
anti-CHMP2A primary and goat anti-rabbit Alexa-594-conjugated fluoronano-
gold (Nanoprobes) and DAPI. Cells were subjected to a subsequent 10-min 3%
PFA fixation and quenching before imaging on a Leica SP5 or SP8 confocal 
microscope. 

After fluorescent imaging, cells were postfixed in glutaraldehyde, subjected to 
silver enhancement (Aurion RGENT SE-EM), stained with OsO₄ and uranyl
acetate, dehydrated through ethanol and embedded in Epon. Blocks were trimmed
to the region identified by confocal imaging and 300-nm serial sections were cut
using a diamond knife. For retracing of the cells of interest, sections were imaged
on a FEI Tecnai12 and subsequently double tilt series of regions of interest were
acquired on a FEI Tecnai20. Tilt series were reconstructed using IMOD and
selected frames and movies were extracted using ImageJ. 

For quantification of holes remaining in the NE after CHMP2A depletion, 
tomograms were acquired by CLEM as described above. Discontinuities in the 
NE were scored as being NPC or non-NPC on the basis of cross-sectional mor- 
phology. Internal diameters of these discontinuities were measured from recon- 
structed tomograms using FIJI. Discontinuities were segregated by size and 
whether they were identifiable as NPCs or not. A threshold was set at 65 nm 
(>2 s.d. smaller than the measured control NPC diameter) and the percentage 
of discontinuities smaller than this was displayed. At least 50 discontinuities 
were analysed per treatment across multiple cells from the indicated number of 
tomograms. 
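
As an illustration of the scoring described above, the following Python sketch computes, for one tomogram, the percentage of discontinuities that are both smaller than the 65-nm threshold and not identifiable as NPCs, relative to all discontinuities (including NPCs). The function name, the tuple layout and the example measurements are hypothetical; the diameters in this study were measured in FIJI.

```python
def percent_small_non_npc(discontinuities, threshold_nm=65.0):
    """Percentage of discontinuities that are below threshold_nm and not NPCs.

    discontinuities: list of (internal_diameter_nm, is_npc) tuples scored from
    one reconstructed tomogram. The denominator includes NPCs.
    """
    if not discontinuities:
        raise ValueError("no discontinuities scored")
    small_non_npc = sum(
        1 for diameter, is_npc in discontinuities
        if diameter < threshold_nm and not is_npc
    )
    return 100.0 * small_non_npc / len(discontinuities)


# Hypothetical measurements for a single tomogram.
example = [
    (90.0, True), (95.0, True), (40.0, False), (70.0, False),
    (30.0, False), (100.0, True), (88.0, True), (60.0, True),
]
percentage = percent_small_non_npc(example)
```

Per-tomogram percentages computed this way could then be averaged across tomograms for each treatment.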

Statistical analysis. Variance was analysed using an F-test, and type-relevant two- 
tailed Student’s t-tests were used to assess significance between test samples and 
controls. No statistical methods were used to predetermine sample size. The 
experiments were not randomized, and the investigators were not blinded to 
allocation during experiments and outcome assessment. 
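
The two-step procedure above can be sketched in Python using only the standard library. Two points are assumptions rather than statements from the text: "type-relevant" is read here as choosing between an equal-variance (Student) and an unequal-variance (Welch) two-tailed t-test, and the F-test cut-off of 0.05 is illustrative. All function names are hypothetical, and in practice a statistics package would normally be used instead of the hand-written incomplete beta function.

```python
import math
from statistics import mean, variance


def _betacf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        even = m * (b - m) * x / ((qam + m2) * (a + m2))
        odd = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        for aa in (even, odd):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
              + a * math.log(x) + b * math.log(1.0 - x))
    bt = math.exp(log_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def f_test_p(a, b):
    """Two-sided P value of the F-test for equality of variances."""
    d1, d2 = len(a) - 1, len(b) - 1
    f = variance(a) / variance(b)
    upper = betai(d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f))  # P(F > f)
    return min(1.0, 2.0 * min(upper, 1.0 - upper))


def two_tailed_t(a, b, equal_var):
    """Return (t, degrees of freedom, two-tailed P) for Student or Welch t-test."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)
    if equal_var:
        df = na + nb - 2
        pooled = ((na - 1) * va + (nb - 1) * vb) / df
        se = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    else:
        se2 = va / na + vb / nb
        se = math.sqrt(se2)
        df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    t = (mean(a) - mean(b)) / se
    p = betai(df / 2.0, 0.5, df / (df + t * t))
    return t, df, p


def compare_to_control(test, control, alpha=0.05):
    """F-test first, then the matching two-tailed t-test (alpha is an assumption)."""
    equal_var = f_test_p(test, control) >= alpha
    t, df, p = two_tailed_t(test, control, equal_var)
    return {"equal_var": equal_var, "t": t, "df": df, "p": p}


# Hypothetical data, not values from this study.
result = compare_to_control([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 6.0])
```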

ImageStream analysis. siRNA-treated HeLa cells in 6-well dishes were 
detached, fixed in 4% PFA, permeabilized with 0.1% Triton X-100 and stained 
in suspension with mAb414, Alexa-594-conjugated secondary antibodies and 
DAPI (at 0.1 μg ml⁻¹). In-focus, single-cellular populations were acquired and
a mask was applied to the DAPI channel and duplicated then dilated by three 
pixels to encompass the mAb414 signal surrounding the nuclei. The difference in 
the mAb414 signal captured by these masks was given as the nuclear envelope 
mAb414 and presented as a histogram. Representative images of average mAb414 
intensity were extracted for presentation. 
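
The masking step above can be sketched in plain Python on a small pixel grid. This is an illustration only, not the ImageStream analysis software: the function names are hypothetical, and a square (8-connected) dilation of three pixels is assumed because the shape of the dilation element is not stated in the text.

```python
def dilate(mask, radius):
    """Dilate a boolean 2D mask by `radius` pixels using a square element."""
    height, width = len(mask), len(mask[0])
    out = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                        out[yy][xx] = True
    return out


def masked_sum(image, mask):
    """Sum of image intensities over the pixels where mask is True."""
    return sum(
        value
        for image_row, mask_row in zip(image, mask)
        for value, inside in zip(image_row, mask_row)
        if inside
    )


def envelope_signal(dapi_mask, mab414_image, radius=3):
    """mAb414 signal in the dilated DAPI mask minus the signal in the original mask."""
    dilated = dilate(dapi_mask, radius)
    return masked_sum(mab414_image, dilated) - masked_sum(mab414_image, dapi_mask)


# Hypothetical 9 x 9 image of uniform intensity with a one-pixel "nucleus".
image = [[1] * 9 for _ in range(9)]
nucleus = [[False] * 9 for _ in range(9)]
nucleus[4][4] = True
rim = envelope_signal(nucleus, image)
```

Subtracting the signal under the original mask leaves only the rim surrounding the DAPI-defined nucleus, which is where the mAb414 nuclear envelope signal is expected.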
Extended Data Figure 1 | Localization of ESCRT components during the
cell cycle. a, b, Immunofluorescence analysis of HeLa cells stained with anti- 
tubulin, anti-CHMP2A or anti-CHMP2B and DAPI (a). Images in a are 
representative of two acquired images per field of view. Cells in b were treated 
with control or CHMP2A-targeting siRNA; images representative of four 
(control) or two (CHMP2A siRNA) acquired images. c, Deconvolved 
projections of HeLa cells stained with anti-CHMP2A and DAPI, corresponding 
to stills from Supplementary Video 1. Images representative of two 
deconvolved image series. d, HeLa cells stably expressing GFP-CHMP4B were 
imaged live during the anaphase to telophase transition. Telophase frames at 
30-s intervals are presented, corresponding to stills from Supplementary Video 
2. Images representative of four acquisitions. e, Immunofluorescence analysis 
of human diploid fibroblasts stained with anti-CHMP2A, anti-tubulin and
DAPI; images representative of three acquired cells per cell cycle phase.

f, g, Immunofluorescence analysis of HeLa cells stained with anti-CHMP2A, 
DAPI and either anti-mAb414 (f) or anti-LaminA/C (g), images representative 
of five acquired cells. Arrowheads indicate regions of formed nuclear pores or 
lamina as indicated. h, Quantification of abnormal nuclei (the presence of 
multiple lobes, micronuclei, lamina ingression or invagination) in HeLa cells 
transfected with the indicated siRNA and stained with anti-LaminA/C (1,300 
cells over 5 experiments quantified per treatment; data are mean ± s.d.). Images
representative of three (control, CHMP2A siRNA) or two (LEM4 siRNA)
acquired fields of view and resolved cell lysates were examined by western
blotting with anti-CHMP2A, anti-CHMP2B or anti-GAPDH antisera as
indicated. Scale bars, 10 μm.

Extended Data Figure 2 | Correlative light and electron microscopy (CLEM)
of endogenous CHMP2A localization in telophase NE. a-c, Phase-contrast
(a), correlative immunofluorescence (b) and transmission electron microscopy
(c) of HeLa cells stained with anti-CHMP2A, detected by Alexa-594-
fluoronanogold and DAPI. Boxed region in a is shown in b; boxed region in b is
shown in c. In all cases, images representative of three cells prepared for CLEM.
d, 3D rendering of tomographic reconstruction of forming NE from boxed 
region in c and Fig. 1d; a single example of a nucleo-cytoplasmic channel was 
selected for 3D rendering. e-g, Z-slices extracted from tomographic 
reconstructions of forming NE depicting CHMP2A localization to isolated 
vesicles (e, i) and nucleo-cytoplasmic channels (arrows in e, ii, f, g) at the
indicated Z-heights. Localization of CHMP2A to nucleo-cytoplasmic channels
was observed in three independent cells; data from a second cell are presented 
in Extended Data Fig. 3. Note CHMP2A localization to nucleo-cytoplasmic 
channels is distinct from nuclear pores (asterisk in f). h, Quantification of 
CHMP2A labelling from two independently prepared cells. Channels were 
defined as discontinuities up to 80 nm, and gaps were defined as discontinuities 
over 80 nm. Distances of the gold-particles from channels or gaps were 
measured on the tomograms in three-dimensions and plotted as a histogram. 
Most (74.4%) of the gold label was found within 150 nm of nucleo-cytoplasmic 
channels, and most (70.6%) of the gold label was found more than 150 nm from 
the larger gaps in the NE. Scale bars, 10 um (b) and 200 nm (f, g). 
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Extended Data Figure 3 | CLEM of endogenous CHMP2A localization in c depicting CHMP2A-localization to nucleo-cytoplasmic channels at the 


telophase NE. a-c, Phase-contrast (a), correlative immunofluorescence indicated Z-heights. Arrow indicates nucleo-cytoplasmic channel. Images in all 
(b) and transmission electron microscopy (c) ofa second HeLa cell stained with _ cases representative of 3 cells processed for CLEM, quantification of gold 
anti-CHMP2A, detected by Alexa-594-fluoronanogold and DAPI. Boxed localization given in Extended Data Fig. 2H. Scale bars, 24 1m (b), 1 pm 


region in a is shown in b; boxed region in b is shown in c. d, Z-slices extracted  (c) and 200 nm (d). 
from tomographic reconstruction of forming NE from boxed region in 
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Extended Data Figure 4 | Mitotic defects in cells reliant on mutated forms of 
CHMP2aA. a, Quantification of CHMP2A recruitment to the telophase NE or 
the midbody from Fig. 2c (n = 3, 10 cells (midbody or telophase) scored per 
experiment). b, Quantification of cytokinetic failure from cells treated with the 
indicated siRNA (300 cells were quantified per experiment, from three 
independent experiments). Data are mean = s.d. 
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Extended Data Figure 5 | Screening for ESCRT-p97 complex interactions. _ detailing binding of CHMP2A to GST (n = 4), His-UFD1 (n = 5) or His- 
a-d, B-galactosidase activity of yeast co-transformed with the indicated Gal4. | UFD1(1-257) (m = 4). As no reduction in thermophoresis signal was observed 


(ESCRT)- and VP16-fused proteins (n = 2). e, Resolved cell lysates and for GST or His-UFD1(1-257) across the concentration range, we present here 
glutathione-bound fractions from 293T cells transfected with the indicated the average thermophoresis signal change at equivalent protein concentrations 
fusion proteins were examined by western blotting with anti-GFP (n = 3). for these three proteins, normalized to zero at the concentration in capillary 1. 
f, B-galactosidase activity of yeast co-transformed with the indicated Gal4- and _h, Alexa-647-labelled CHMP2A, His-UFD1 and His-UFD1(1-257) were 
VP16-fused proteins (n = 3). g, Microscale thermophoresis experiments examined by infrared imaging or Coomassie staining. Data are mean + s.d. 
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Extended Data Figure 6 | UFD1 depletion does not affect ESCRT- 
dependent receptor degradation, lentivirus release or cytokinetic abscission. 
a, Resolved cell lysates of HeLa cells transfected with the indicated siRNA were 
examined by western blotting with anti-UFD1 or anti-HSP90 antisera.
b, Resolved lysates of human diploid fibroblasts transfected with the indicated
siRNA and treated for the indicated times with epidermal growth factor
(20 ng ml⁻¹) were examined by western blotting with anti-EGFR, anti-UFD1
and anti-GAPDH antisera. EGFR degradation was quantified by densitometry
(n = 3). c, Resolved cell lysates from 293T cells transfected with the indicated
HIV-1 based lentiviral plasmids, a virally packaged GFP-plasmid, and the 
indicated siRNA were examined by western blotting with anti-p24 capsid, 
-HSP90, -TSG101, -CHMP2A, -CHMP2B and -UFD1 antibodies. Viral 
supernatants were collected and used to infect target HeLa cells. Resolved 
virions present in the 293T supernatant were examined by western blotting 
with anti-p24 capsid. Resolved lysates of infected HeLa cells were examined by 
western blotting with anti-GFP. Virion release was the ratio of released to 
cellular p24 capsid, as quantified by densitometry (n = 2); infectivity was 
quantified as GFP signal in target cells, as quantified by densitometry (n = 2). 
d, siRNA-transfected HeLa cells were fixed and stained with anti-tubulin. 
Multinucleate cells (n = 5) or cells connected by midbodies (n = 5) were scored
visually, 300 cells scored per experiment. Data are mean ± s.d.
Extended Data Figure 7 | ESCRT depletion impairs NE-rim formation.

a, b, Timelapse microscopy analysis and quantification of NE-rim formation in 
HeLa cells stably expressing YFP-LAP2β and mCh-H2B and treated with the
indicated siRNA. Scale bars, 10 µm. Time for rim formation post anaphase
onset given (mins) (control, 8.53 ± 0.09, 226 cells analysed over 8 independent
experiments; CHMP2A-1, 7.60 ± 0.09, 205 cells analysed over 7 independent
experiments; CHMP2A-2, 6.86 ± 0.12, 37 cells analysed over 2 independent
experiments; CHMP2B, 6.92 ± 0.09, 79 cells analysed over 4 independent
experiments; CHMP2A and CHMP2B, 6.84 ± 0.13, 50 cells analysed
over 2 independent experiments; CHMP4B, 7.07 ± 0.14, 44 cells analysed
over 2 independent experiments; UFD1, 9.2 ± 0.18, 39 cells analysed over
3 independent experiments). Data are mean ± s.e.m. (in minutes). Images
representative of the indicated number of cells analysed. c, Resolved cell lysates
from a were analysed by western blotting with the indicated antisera.
Extended Data Figure 8 | ESCRT depletion does not impair nuclear pore
formation. a, Schematic of nuclear envelope integrity assay. b, Control- 
siRNA-treated HeLa cells reporting nucleo-cytoplasmic partitioning using the 
GFP-NLS-βGal assay, average NE compartmentalization from 20 cells
presented. Nucleo-cytoplasmic partitioning stabilizes at 85 min (indicated by
arrow). c, Immunofluorescence analysis of HeLa cells stably expressing
YFP-LAP2β, transfected with the indicated siRNA then stained with anti-mAb414
and DAPI (n = 3). Scale bars, 10 µm. d, Mask used to quantify nuclear pore
formation by image-based flow cytometry (Imagestream). e, Imagestream
analysis of HeLa cells transfected with the indicated siRNA, then stained with
anti-mAb414 and DAPI. Nuclear pore intensity quantified by mask described
in d. Representative images from two independent experiments, histogram
and population averages displayed, graphical quantification of NPC intensity
from the indicated number of gated cells (control, 3,045; CHMP2A-1, 1,256;
CHMP2A-2, 2,152; CHMP2B, 5,237; UFD1-1, 4,146; UFD1-3, 4,325). Data are
mean ± s.d.
Extended Data Figure 9 | Requirements for nucleo-cytoplasmic
compartmentalization. a, Quantification of NE sealing from siRNA-treated 
cells as in Fig. 4b (control, 140 cells from 7 independent experiments; UFD1-1, 
60 cells from 3 independent experiments, P = 0.044; UFD1-3, 60 cells from 3 
independent experiments, P = 0.021; CHMP2B 40 cells from 2 independent 
experiments; two-tailed Student’s t-test was used to assess significance at the 
85-min time point). b, Resolved cell lysates from a were analysed by western 
blotting with the indicated antisera. c, NE integrity assay as performed with cells 
stably expressing mCh-H2B and GFP-NLS and transfected with the indicated 
siRNA. Differences in nucleo-cytoplasmic partitioning were assessed after
plateau at the 65-min time point using a two-tailed Student’s t-test (control, 79
cells from 4 independent experiments, CHMP2A-1, 60 cells from 3 
independent experiments, P = 0.048; CHMP2A-2, 52 cells from 3 independent 
experiments, P = 0.011; CHMP3, 28 cells from 3 independent experiments, 
P = 0.028). d, e, HeLa cells stably expressing mCh-H2B and GFP-NLS were
transfected with the indicated siRNA and imaged live. 60 min after anaphase 
onset, cytoplasmic signal was photo-ablated (T = 0) and recovery of 
cytoplasmic signal from the nuclear pool was calculated for the indicated 
conditions (cytoplasmic:nuclear ratio of GFP-NLS was normalized to T = 0; 
control, 21 cells from 4 independent experiments; CHMP2A-1, 24 cells from 4 
independent experiments, P = 0.04; CHMP2A-2, 23 cells from 4 independent 
experiments, P = 0.05; CHMP3, 15 cells from 3 independent experiments,
P = 0.004; two-tailed Student’s t-test was used to assess significance after
10 min). Scale bars, 10 µm. f, Scoring of multinucleate and midbody-connected
HeLa cells transfected with the indicated siRNA and stained with anti-tubulin
and DAPI (300 cells analysed per condition, n = 4). Data are mean ± s.e.m.
(a, c, d) and mean ± s.d. (f).
Histone H3.3 is required for endogenous retroviral
element silencing in embryonic stem cells 


Simon J. Elsässer, Kyung-Min Noh, Nichole Diaz, C. David Allis & Laura A. Banaszynski


Transposable elements comprise roughly 40% of mammalian gen- 
omes’. They have an active role in genetic variation, adaptation and 
evolution through the duplication or deletion of genes or their 
regulatory elements” *, and transposable elements themselves can 
act as alternative promoters for nearby genes, resulting in non- 
canonical regulation of transcription®®. However, transposable 
element activity can lead to detrimental genome instability’, and 
hosts have evolved mechanisms to silence transposable element 
mobility appropriately*”. Recent studies have demonstrated that 
a subset of transposable elements, endogenous retroviral elements 
(ERVs) containing long terminal repeats (LTRs), are silenced 
through trimethylation of histone H3 on lysine 9 (H3K9me3) by 
ESET (also known as SETDB1 or KMT1E)” and a co-repressor 
complex containing KRAB-associated protein 1 (KAP1; also 
known as TRIM28)"' in mouse embryonic stem cells. Here we show 
that the replacement histone variant H3.3 is enriched at class I and 
class II ERVs, notably those of the early transposon (ETn)/MusD 
family and intracisternal A-type particles (IAPs). Deposition at a 
subset of these elements is dependent upon the H3.3 chaperone 
complex containing α-thalassaemia/mental retardation syndrome
X-linked (ATRX) and death-domain-associated protein
(DAXX)’?"**. We demonstrate that recruitment of DAXX, H3.3 
and KAP1 to ERVs is co-dependent and occurs upstream of 
ESET, linking H3.3 to ERV-associated H3K9me3. Importantly, 
H3K9me3 is reduced at ERVs upon H3.3 deletion, resulting in 
derepression and dysregulation of adjacent, endogenous genes, 
along with increased retrotransposition of IAPs. Our study iden- 
tifies a unique heterochromatin state marked by the presence of 
both H3.3 and H3K9me3, and establishes an important role for 
H3.3 in control of ERV retrotransposition in embryonic stem cells. 

Deposition of the histone variant H3.3 has been linked to regions of 
high nucleosome turnover and has been traditionally associated with 
gene activation. However, we and others have demonstrated that H3.3 
is incorporated into both facultative and constitutive heterochroma- 
tin’?!>"*, Here, we used chromatin immunoprecipitation followed by 
sequencing (ChIP-seq) to identify 79,532 regions of H3.3 enrichment 
across the entire mouse genome, including repetitive regions (see later 
and Methods for details of data analysis), and performed a hierarchical 
clustering of H3.3 with various chromatin modifications. Consistent 
with deposition at euchromatin and heterochromatin, we observe 
H3.3 associated with both active (for example, H3K4me3, H3K27ac,
H3K4me1) and repressed (for example, H3K9me3, H3K27me3,
H4K20me3) chromatin states (Fig. 1a). While most H3.3 peaks
localized to genic regions and intergenic regulatory regions such as 
enhancers”, 23% (18,606/79,532) intersected with H3K9me3 peaks 
indicative of heterochromatic regions. Of these, 59% (11,010/18,606) 
localized to interspersed repeats (longer than 1 kb) and only 9% (1,747/18,606)
fell within genic regions (Fig. 1b). Sequential ChIP-seq
(re-ChIP) demonstrated co-enrichment of H3.3 and H3K9me3 at
these regions (Fig. 1c).
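
As a quick arithmetic check, the fractions quoted above can be recomputed from the peak counts given in the text. The following is a minimal Python sketch; the helper name `percent` and the variable names are ours and purely illustrative, and only the four counts come from the text.

```python
def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Peak counts quoted in the text
h33_peaks = 79_532      # all regions of H3.3 enrichment
h33_h3k9me3 = 18_606    # H3.3 peaks intersecting H3K9me3 peaks
in_repeats = 11_010     # of those, in interspersed repeats (longer than 1 kb)
in_genes = 1_747        # of those, within genic regions

overlap_pct = percent(h33_h3k9me3, h33_peaks)
repeat_pct = percent(in_repeats, h33_h3k9me3)
genic_pct = percent(in_genes, h33_h3k9me3)

print(overlap_pct, repeat_pct, genic_pct)
```

The three values correspond, in order, to the share of H3.3 peaks that overlap H3K9me3, and to the shares of those shared peaks that fall in interspersed repeats and in genic regions.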

To identify repeat families that were associated with H3.3, we 
mapped our H3.3 ChIP-seq data to a comprehensive database of 
murine repetitive sequences'”’. Unbiased hierarchical clustering 
demonstrated a striking correlation between H3.3, H3K9me3 and 
H3.3—H3K9me3 re-ChIP over class I and II ERVs, as well as enrich- 
ment of known silencing factors KAP1 and ESET (Fig. 1d and 
Extended Data Fig. 1). Class III ERVs and non-LTR long interspersed 
nuclear elements (LINEs) and short interspersed nuclear elements 
(SINEs) carry little H3.3 and H3K9me3 but higher levels of 
H3K9me2. However, the promoter/5’ untranslated region (UTR) of
intact LINE1 elements is enriched with H3.3, H3K9me3, KAP1 and
ESET (Fig. 1d and Extended Data Fig. 1), suggesting a related
mechanism of repression. Analysing individual well-annotated integration
sites of ERVs°”°, we found that IAP and ETn/MusD ERVs, the most 
active transposons in the mouse genome”, are significantly enriched 
in H3.3 and H3K9me3 (Extended Data Fig. 2a–c), with 94% of IAP and
53% of ETn ERVs enriched with both H3.3 and H3K9me3 (Extended
Data Fig. 2d).
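
The enrichment values for repeat families are reported as log2 fold enrichment of ChIP over a matched input, with read counts normalized to the total number of reads in each data set. The sketch below shows one common way to compute such a value for a single repeat family. It is an illustration under stated assumptions, not the authors' pipeline: the function name, the pseudocount of 1 and the example read counts are all hypothetical.

```python
import math


def log2_enrichment(chip_reads, chip_total, input_reads, input_total,
                    pseudocount=1.0):
    """log2 ratio of depth-normalized ChIP reads to input reads.

    The pseudocount (an assumption of this sketch) avoids division by
    zero for repeat families with no reads in the input.
    """
    chip_rpm = (chip_reads + pseudocount) / chip_total * 1e6
    input_rpm = (input_reads + pseudocount) / input_total * 1e6
    return math.log2(chip_rpm / input_rpm)


# Made-up counts for illustration only (not data from the study)
enriched = log2_enrichment(3_999, 10_000_000, 999, 10_000_000)
depleted = log2_enrichment(249, 10_000_000, 999, 10_000_000)
print(enriched, depleted)
```

Positive values indicate enrichment over input and negative values indicate depletion, matching the red and blue scale described for the heat maps.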

Repetitive regions provide a challenge to next-generation sequen- 
cing analysis due to the ambiguity arising from mapping short reads to 
non-unique sequences. Standard ChIP-seq alignments disregard reads 
that map to more than a single location in the genome, leaving gaps 
wherever the underlying sequence is non-unique (Fig. 1e). To include
interspersed repeats, we allowed random assignment of ambiguously
mappable reads to one of the best matches” (Fig. 1e), effectively
averaging counts over multiple occurrences of the same exact read match.
As exemplified by ETn and IAP insertions downstream of the Vnn3 
transcription start site, H3K9me3 is broadly enriched over the non- 
unique ERV sequence, whereas H3.3 appears to be more confined over 
3’ and 5’ regions of the repeats (Fig. 1e). Neither ChIP-seq using an
antibody recognizing only the canonical H3 isoforms (H3.1/2) nor an
antibody recognizing all H3 isoforms (total H3; H3.3 constitutes
~10% of total H3 in embryonic stem (ES) cells) show enrichment at
the corresponding regions (Fig. 1e), and H3.3 enrichment was lost in
ES cells lacking H3.3 (Extended Data Fig. 3)'*. We were further able to
detect both H3.3 and H3K9me3 in the uniquely mappable flanking
sites of IAP and ETn ERVs (Extended Data Fig. 4a, b). In addition to
full ERVs, we found single (so-called ‘orphan’) LTRs to be enriched in 
both H3.3 and H3K9me3 (Extended Data Fig. 4c), suggesting that the 
LTR sequence itself is sufficient for the nucleation of H3.3 and hetero- 
chromatin factors. 
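
The difference between the two read-handling policies described above (discarding reads with more than one best match versus assigning each such read at random to one of its best matches) can be illustrated with a toy simulation. This is a minimal sketch, not the alignment software used in the study; the function name `assign_reads` and the locus labels are hypothetical.

```python
import random
from collections import Counter


def assign_reads(read_hits, inclusive, rng):
    """Count reads per locus.

    read_hits: list of lists; each inner list holds the equally good
               best-match loci for one read.
    inclusive: if False, reads with more than one best match are dropped
               (unique-only policy); if True, each such read is assigned
               to one of its best matches chosen at random.
    """
    counts = Counter()
    for hits in read_hits:
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif inclusive and hits:
            counts[rng.choice(hits)] += 1
    return counts


# Toy input: 1,000 reads from a repeat present at two identical copies,
# plus 10 reads from a uniquely mappable flank (labels are hypothetical).
rng = random.Random(0)
reads = [["IAP_copy_A", "IAP_copy_B"]] * 1000 + [["unique_flank"]] * 10

unique_counts = assign_reads(reads, inclusive=False, rng=rng)
inclusive_counts = assign_reads(reads, inclusive=True, rng=rng)
print(dict(unique_counts))
print(dict(inclusive_counts))
```

Under the unique-only policy the repeat copies receive no reads at all, which is the gap described in the text. Under the inclusive policy the repeat-derived reads are split roughly evenly between the copies, so the signal is averaged over all occurrences of the same sequence rather than attributed to any one insertion.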

H3.3 deposition has been linked to dynamic chromatin regions with 
high levels of nucleosome turnover and DNA accessibility. As H3.3 
enrichment at ETn and IAP ERVs was comparable to levels found at 
active promoters in ES cells (Extended Data Figs 2a and 5a; compare 
also to Rps12 enrichment in Fig. 1e), we tested whether ERVs were
nucleosome-depleted in ES cells. Surprisingly, we found that ERVs 
Figure 1 | H3.3 is co-enriched with H3K9me3 at class I and II ERV-
associated heterochromatin. a, Hierarchical (Spearman rank) clustering of 
H3.3 peaks on chromosome 1 with histone modifications associated with active 
(green) or repressed (red) chromatin states. Annotated genes and ERVs are 
shown. b, Venn diagram of H3.3 and H3K9me3 peaks demonstrating overlap 
at repetitive elements. c, ChIP-seq density heat maps for peaks classified as H3.3 
only (n = 60,925), both H3.3 and H3K9me3 (n = 18,605), or H3K9me3 only 
(n = 54,204). Colour intensity represents normalized and globally scaled tag 
counts. d, ChIP-seq enrichment of H3.3 and heterochromatic histone 
modifications and factors mapped to the repetitive genome. Data are 


showed low DNA accessibility compared to promoters of highly 
expressed genes with comparable H3.3 enrichment, as measured by 
DNase and MNase digestion”, and showed no signs of transcription as 
judged by RNA polymerase (Pol) II occupancy’? (Extended Data Fig. 
5a). Notably, we find that newly synthesized H3.3 (ref. 26) is rapidly 
incorporated at IAPs, despite the high levels of H3K9me3 and silent 
state (Extended Data Fig. 5b). Overall, our data suggest that a substan- 
tial fraction of H3.3 resides at ERVs in ES cells and constitutes a unique 
chromatin state fundamentally distinct from previously described 
combinations of histone variants and modifications. 

Previous studies have demonstrated that silencing of ERVs via 
H3K9me3 is unique to the pluripotent or embryonic state, with adult 
somatic tissues showing dependence upon DNA methylation for ERV 
repression. Concomitant with loss of H3K9me3, H3.3 enrichment is 
lost from IAP and ETn ERVs upon differentiation from ES cells to 
neuronal precursor cells (NPCs) (Fig. 1f and Extended Data Fig. 6a, b). 
These data indicate that, like H3K9me3, H3.3 may have a role in the 
embryonic establishment, but not the somatic maintenance, of this 
silenced chromatin state. Unlike H3K9me3, H3.3 is retained at telo- 
meres upon differentiation (Fig. 1f), suggesting uncoupled or alterna- 
tive mechanisms of repression from those functioning at ERVs. 

H3K9me3 is facilitated by two histone methyltransferases— 
ESET and SUV39h1/2—that display distinct properties and regions 


represented in a hierarchically (Spearman rank) clustered heat map of log2 fold
enrichment (red) or depletion (blue) over a matched input. See Extended Data 
Fig. 1 for complete heat map. e, Genome browser ChIP-seq representations in 
ES cells. Read counts are normalized to total number of reads for each data set 
and exclude (‘unique’) or include (‘inclusive’) repetitive reads. MTA, MT 
subfamily A. f, ChIP-seq enrichment of H3.3 and H3K9me3 at various repeat 
regions in ES cells (ESCs) and NPCs. Data are represented as in d. g, Levels of 
co-enriched H3.3-H3K9me3 in control and ESET conditional knockout (cKO) 
ES cells. IAPEz, IAP subfamily Ez; WT, wild type. ****P < 0.0001, one-sided
Wilcoxon signed rank test. NS, not significant. 


of genomic activity. Previous studies demonstrate that ESET has a 
critical role in the establishment of H3K9me3 at a large number of 
ERVs’°, while SUV39h1/2 is involved in the maintenance and spread- 
ing of H3K9me3 at a subset of repeat elements’’. To elucidate which 
methyltransferase was responsible for establishing H3.3/H3K9me3 
heterochromatin, we analysed the effect of ESET and SUV39h1/2 
knockout on H3K9me3 levels at H3.3-containing ERVs. We found 
that ESET was required for H3K9me3 at all H3.3-containing classes 
of repeats (Fig. 1g and Extended Data Fig. 6c). SUV39h1/2 deletion 
resulted in a small decrease of H3K9me3 at IAP and ETn/MusD ele- 
ments, but greatly decreased H3K9me3 at intact LINE elements, 
including their 5' UTR (Extended Data Fig. 6c). In conclusion, the 
co-occurrence of H3.3 and H3K9me3 facilitated by ESET methyltrans-
ferase activity defines a novel class of heterochromatin that functions 
at ERVs and intact LINE1 5’ ends. 

The histone variant H3.3 is incorporated at distinct regions of chro- 
matin by either the HIRA or ATRX-DAXX histone chaperone com- 
plexes'*""*. We and others previously demonstrated that HIRA is 
responsible for H3.3 enrichment at genic regions, while the ATRX- 
DAXX complex facilitates H3.3 deposition at simple repeat regions 
such as telomeres’*”*'*. Using ChIP-seq, we found that DAXX and 
ATRX were responsible for H3.3 incorporation at regions enriched 
with both H3.3 and H3K9me3, whereas HIRA facilitated deposition at 
Figure 2 | DAXX-ATRX is responsible for H3.3 deposition at a subset of
ERVs and co-localizes with ERV-specific heterochromatic factors. a, ChIP- 
seq density heat maps for peaks classified as both H3.3 and H3K9me3 (n = 
18,605) or H3.3 only (n = 60,925). Colour intensity represents normalized and 
globally scaled tag counts. WT, wild type. b, ChIP-seq enrichment of H3.3 
chaperones and chaperone-dependent H3.3 deposition at repetitive regions. 
Data are represented in a heat map of log2 fold enrichment (red) or depletion
(blue) over a matched input. c, Venn diagram of H3.3, H3K9me3 and KAP1 
peaks demonstrating substantial overlap in ES cells. d, Levels of H3.3 in control 
and KAP1 conditional knockout (cKO; top) and control and ESET cKO 
(bottom) ES cells. ****P < 0.0001, *P < 0.05, one-sided Wilcoxon signed rank 


regions enriched with H3.3 alone (Fig. 2a). ATRX and DAXX deletion, 
but not HIRA, attenuated H3.3 enrichment at telomeres as well as at 
IAP ERVs, but not at ETn/MusD ERVs (Fig. 2b and Extended Data 
Fig. 7a, b), indicating that ATRX-DAXX is required for H3.3
enrichment at specific subclasses of ERVs. ChIP-seq analysis at repeats
demonstrated that both DAXX and ATRX co-occupied class I and II 
ERVs enriched with KAP1 and ESET, as well as telomeres (Fig. 2b). To 
Figure 3 | H3.3 is required for the maintenance of H3K9me3 at specific class
I and II ERVs. a, Levels of H3K9me3 and total H3 in control and H3.3 
knockout (KO) ES cells. ****P < 0.0001, one-sided Wilcoxon signed rank test. 
NS, not significant. b, ChIP with quantitative polymerase chain reaction 
(qPCR) analysis of H3K9me3 enrichment at various repeat regions in control 
ES cells and H3.3-KO ES cells exogenously expressing either H3.2 or H3.3. 
rDNA, ribosomal DNA. Error bars represent standard deviation (s.d.) from one 
experiment (n = 3). Data are representative of three independent ChIP 
experiments. *P < 0.05, **P < 0.01, ***P < 0.001, t-test. 


242 | NATURE | VOL 522 | 11 JUNE 2015 



test. NS, not significant. e, Immunoblotting of DAXX immunoprecipitated 
from wild-type or H3.3-null nuclear extracts showing co-immunoprecipitation 
with ATRX, H3.3, H3K9me3 and KAP1 independent of H3.3 (1% input). 
Asterisk denotes cross-reacting band. f, Levels of KAP1 in control and H3.3- 
knockout ES cells. Data are presented as in d. g, Model of corepressor complex 
function at IAPs: KAP1 recognizes ERVs through sequence-specific KRAB zinc 
finger (ZNF) DNA-binding proteins and recruits DAXX-ATRX independently 
of its interaction with ESET. DAXX-ATRX deposit H3.3 at IAPs, facilitating 
efficient KAP1 association with chromatin. ESET is then recruited, resulting in 
H3K9me3-mediated silencing of ERVs. 


understand further the relationship between the corepressor KAP1 
and ATRX-DAXX-dependent H3.3 deposition at ERVs, we mapped 
genome-wide enrichment of KAP1 and found that almost half 
(13,730/29,185) of the KAP1 peaks coincided with shared H3.3/ 
H3K9me3 peaks (Fig. 2c). We therefore wanted to determine whether 
KAP1 had a role in targeting H3.3 deposition via recruitment of 
ATRX-DAXX. Indeed, H3.3 enrichment was reduced at IAP ERVs
in the absence of KAP1 but was independent of ESET (Fig. 2d and
Extended Data Fig. 7c-e), suggesting a novel role for KAP1 in
recruitment of ATRX-DAXX.

To determine whether KAP1 and ATRX-DAXX associated bio- 
chemically, we prepared nuclear extracts from ES cells. We found that 
DAXX co-immunoprecipitated its known complex member ATRX as 
well as its substrate H3.3 (Fig. 2e). Of note, DAXX-associated histone 
was enriched with H3K9me3 (Fig. 2e). In addition, DAXX co-immu- 
noprecipitated KAP1 (Fig. 2e), suggesting that DAXX-ATRX and 
KAP1 can form a biochemical complex. HIRA was not
co-immunoprecipitated, demonstrating the specificity of the interaction. Given the
requirement of H3.3 for DAXX folding”’, we repeated DAXX immu- 
noprecipitation from two independent ES cell lines lacking H3.3 
(Fig. 2e)’®. While overall nuclear DAXX levels were reduced in the 
absence of H3.3 (Fig. 2e and Extended Data Fig. 7f), in agreement with 
a co-folding mechanism, the low levels of remaining DAXX main- 
tained association with KAP1 (Fig. 2e), suggesting an interaction inde- 
pendent of the H3.3 substrate. 

We next wanted to determine whether the loss of H3.3 affected 
KAP1 or DAXX targeting to ERVs. Intriguingly, both KAP1 and 
DAXX recruitment to ERVs was reduced in the absence of H3.3, 
and telomere association was lost (Fig. 2f and Extended Data 
Fig. 7g). We cannot distinguish, however, if reduced enrichment of 
DAXX at chromatin is a result of KAP1 impairment or a direct 
Figure 4 | Loss of H3.3 leads to ERV derepression. a, RNA-seq analysis of
repeat transcripts. Data are represented as log2 change in transcript over control
for H3.3 KO1 and KO2 ES cells. b, RNA-seq analysis of transcripts with nearby 
ERVs that are significantly upregulated in H3.3 KO1 and KO2 (q < 0.05) (see 
Extended Data Fig. 10a). Data are represented as in a. Nearby ERV classes are 
indicated. FPKM, fragments per kilobase of transcript per million fragments 
mapped. c, Representative example of an upregulated transcript in the absence of 
H3.3. RNA-seq tracks (black) show all reads mappable to the genome (without 
restriction to known transcripts). H3K9me3 (purple and violet) and H3.3 (red) 
tracks show inclusive reads as standardized read densities. The relative 
H3K9me3 difference between wild type (WT) and knockout (KO) is shown in a
separate track (‘difference’). d, Paired-end-based de novo discovery of
non-annotated IAP integration sites in control and H3.3 KO1 ES cells (for details see
Methods). Venn diagram and PCR genotyping validation of non-annotated IAP
integration sites in control and H3.3 KO1 ES cells. PR, paired read. 


consequence of reduced DAXX protein stability in the absence of H3.3. 
Together, these data suggest that H3.3, DAXX and KAP1 are cooperative
in their function related to ERV silencing (Fig. 2g). Intriguingly,
while H3.3 enhances KAP1 and DAXX recruitment to ETn/MusD 
elements (Fig. 2f and Extended Data Fig. 7g), the variant remains 
enriched at these elements in the absence of the corepressor complex 
(Fig. 2b, d and Extended Data Fig. 7c-e). 

As we observed a positive correlation between H3K9me3 and H3.3 
enrichment at IAP ERVs (Extended Data Figs 4a and 8a), we next tested 
whether there was a functional link between H3.3 deposition and 
H3K9me3 establishment at specific subclasses of ERVs. Although glo- 
bal levels of H3K9me3 were not affected by the loss of H3.3 (Extended 
Data Fig. 8b), we found that H3K9me3 was reduced specifically at peaks 
enriched with both H3.3 and H3K9me3, concomitant with a reduction 
of KAP1 occupancy (Extended Data Fig. 8c). Indeed, H3K9me3 levels 
were reduced up to 50% at IAP, ETn and MusD repeats in the absence 
of H3.3 (Fig. 3a). Importantly, nucleosome density was not reduced, as 
evidenced by the overall maintenance of total H3 (Fig. 3a). Intriguingly, 
H3K9me3 levels were reduced at ETn/MusD elements in the absence of 
DAXX, ATRX, KAP1 or ESET (Extended Data Fig. 8d–h), whereas
H3.3 enrichment at these elements was independent of the corepressor 
complex (Fig. 2 and Extended Data Fig. 7), suggesting a multifaceted 
mechanism in which both H3.3 deposition and corepressor complex 
recruitment contribute to ERV silencing. 

Intriguingly, ERVs retained H3.3 to a larger extent than other 
regions in ES cells RNA interference (RNAi)-depleted of H3.3 (H3.3 
knockdown’; Extended Data Fig. 9a–c), suggesting they may act as
‘sinks’ for the remaining low levels of H3.3 present in H3.3-knock- 
down ES cells. Furthermore, exogenously expressed H3.3, but not 
H3.2, in both H3.3-knockout and H3.3-knockdown ES cells was 
focally enriched at IAP ERVs (Extended Data Fig. 9d-f). 
Importantly, exogenous expression of H3.3, but not H3.2, was able 
partially to rescue the loss of H3K9me3 at specific repeat regions 
(Fig. 3b). Together, these data suggest a direct and variant-specific role 
for H3.3 in establishing H3K9me3 chromatin at a subset of ERVs that 
cannot be compensated by the canonical H3.1/2 isoforms. 

As H3K9me3 is known to be required for silencing of ERVs!°, we 
tested whether loss of H3.3 would cause a derepression of ERVs con- 
comitant with a reduction of H3K9me3 levels. RNA-sequencing 
(RNA-seq) demonstrated a moderate increase in global transcripts 
from IAPs, but not ETn/MusD ERVs (Fig. 4a). Since ERVs have 
recently been shown to control expression of nearby genes*®, we next 
tested whether endogenous genes that were deregulated in H3.3- 
knockout ES cells were proximal to ERVs. While the majority of 
ERVs are neutral to neighbouring genes, a number of genes in the 
vicinity of ERVs were highly upregulated (Fig. 4b and Extended 
Data Fig. 10a), including the known gene Cyp2b23 and a new putative 
chimaeric transcript originating from a MusD element within the Aass 
gene (Fig. 4c). Notably, the same set of transcripts was upregulated in 
H3.3-depleted ES cells, albeit at a lower level (Fig. 4b), suggesting that 
the remaining H3.3 is partially functional in maintaining silent ERVs. 

We hypothesized that ERV desilencing should result in increased 
ERV mobility. Paired-end sequencing of genomic DNA identified 80 
non-annotated IAP integrations unique to H3.3-knockout ES cells, and 
only 17 unique to wild-type ES cells (Fig. 4d and Extended Data Fig. 10b). 
As derepressed IAPs have been shown to cause chromosome rearrange- 
ments, we analysed H3.3-knockout ES cells for increased genome instab- 
ility. Indeed, karyotypic analysis of H3.3-knockout ES cells showed a 
number of chromosomal abnormalities not observed in the wild-type 
control (Extended Data Fig. 10c). Despite these observations, we cannot 
exclude that genomic instability in H3.3-knockout ES cells might result 
from a loss of function unrelated to retrotransposon silencing”. 

We have uncovered an unexpected role for the histone variant H3.3 
in the establishment of heterochromatin. We demonstrate a hierarchy 
for deposition of H3.3, favouring DAXX-ATRX-mediated chromatin 
assembly at ERVs over transcription-associated deposition. We pro- 
pose a model in which H3.3-containing chromatin facilitates the 
recruitment of KAP1 to ERVs, which in turn recruits DAXX-ATRX 
for the maintenance of H3.3 chromatin, thus creating a positive feed- 
back or propagation loop. This mechanism acts synergistically with 
ESET-mediated H3K9me3 in maintaining a silent chromatin state at 
ERVs. Our data also indicate an H3.3-independent function of 
DAXX-ATRX in maintaining H3K9me3, possibly related to an archi- 
tectural role in a larger corepressor complex with KAP1 and ESET. 
Our findings solidify an emerging understanding of the importance of 
the histone variant H3.3 in the establishment of silenced chromatin 
states and in maintenance of genome stability. 


Online Content Methods, along with any additional Extended Data display items
and Source Data, are available in the online version of the paper; references unique
to these sections appear only in the online paper.
METHODS
ES cell culture. ES cells were cultured under standard conditions (KO-DMEM, 
2 mM Glutamax, 15% ES grade fetal bovine serum, 0.1 mM 2-mercaptoethanol 
and leukaemia inhibitory factor (LIF)). H3.3-knockout/knockdown ES cells were
C57BL/6J background. H3.3-knockout ES cells were a mixed 129 C57BL/6J
background. Generation of H3.3-knockout/knockdown and H3.3-knockout ES cells
was previously described"*. For early passages, cells were maintained on an
irradiated feeder layer. To remove feeders, cells were passaged at least two passages off
of feeders onto gelatin-coated plates. ES cells were routinely tested for mycoplasma. 
ChIP. Native ChIP assays (H3K9me3, H3.3-HA) were performed with approxi- 
mately 2 X 10’ ES cells per experiment. Cells were subject to hypotonic lysis and 
treated with micrococcal nuclease to recover mono- to tri-nucleosomes. Nuclei 
were lysed by brief sonication and dialysed into N-ChIP buffer (10 mM Tris pH 
7.6, 1 mM EDTA, 0.1% SDS, 0.1% Na-Deoxycholate, 1% Triton X-100) for 2 h at 
4 °C. Soluble material was recovered (~70%) and incubated with 3-5 µg of
antibody bound to 75 µl protein A Dynal magnetic beads (Invitrogen)
overnight at 4 °C, with 5% kept as input DNA. Magnetic beads were washed,
chromatin was eluted, and ChIP DNA was dissolved in 10 mM Tris pH 8. 
Crosslinking ChIP assays (H3gen, H3.1/2, H3.3, H3K9me3, DAXX, KAP1) 
were performed with approximately 2 X 10” ES cells per experiment. Cells were 
crosslinked with 1% paraformaldehyde (PFA) for 10 min at room temperature and 
quenched by glycine at a final concentration of 0.125 M. Chromatin was sonicated 
to an average size of 0.3-0.7 kb using a Bioruptor (Diagenode). Purified nuclei were
resuspended in X-ChIP buffer (10 mM Tris pH 8, 100 mM NaCl, 1 mM EDTA, 0.5 
mM EGTA, 0.1% Na-Deoxycholate, 0.5% N-lauroylsarcosine) and incubated with 
3-5 µg of antibody bound to 75 µl protein A Dynal magnetic beads (Invitrogen)
and incubated overnight at 4 °C, with 5% kept as input DNA. Magnetic beads were 
washed, chromatin was eluted, and ChIP DNA was dissolved in 10 mM Tris pH 8. 
Antibodies. H3 general (ab1791, Abcam), H3.3 (09-838, Millipore), H3.1/2 
(ABE154, Millipore), H4 (rabbit antiserum), H3K9me3 (ab8898, Abcam), HIRA 
(mouse monoclonal WC15 and WC119), DAXX (sc-7152, Santa Cruz 
Biotechnology), ATRX (sc-15408, Santa Cruz Biotechnology), KAP1 (ab22553, 
Abcam; ab10483, Abcam), tubulin (TUB2.1, Sigma), lamin (ab26300, Abcam), 
normal rabbit IgG (12-370, Millipore). 
Nuclear extract preparation. ES cells were harvested from 60 15-cm dishes at 
80% confluency. Cell pellets were resuspended in 150 ml hypotonic lysis buffer 
(20 mM HEPES pH 7.9, 10 mM KCl, 5 mM MgCl2, 0.5 mM EGTA, 0.1 mM
EDTA, 5 mM 2-mercaptoethanol, 0.4 mM PMSF) and homogenized (dounce
10× A, 5× B). Cell lysis was confirmed by trypan blue staining. Nuclei were
harvested for 5 min at 1,900g at 4 °C. Nuclei were resuspended in 45 ml buffer
(20 mM HEPES pH 7.9, 110 mM KCl, 2 mM MgCl2, 0.1 mM EDTA, 5 mM
2-mercaptoethanol, 0.4 mM PMSF, 1× complete protease inhibitor cocktail
(Roche)). One-tenth volume saturated (NH4)2SO4 pH 7.5 (final concentration
~400 mM) was added and lysates were incubated for 20 min at 4 °C rotating.
Lysates were clarified for 90 min at 35,000 r.p.m. at 4 °C. Protein complexes were
precipitated by slow addition of (NH4)2SO4 to 60% saturation and collected for 10
min at 13,000 r.p.m. at 4 °C. Precipitated complexes were resuspended by dialysis 
in immunoprecipitation buffer (20 mM HEPES pH 7.9, 200 mM KCl, 0.01% NP- 
40, 5 mM 2-mercaptoethanol, 0.4 mM PMSF) and concentration was determined 
by Bradford assay. 
Immunoprecipitation. Five micrograms antibody bound to 25 jl Dynabeads was 
incubated with 1 mg of ES cell nuclear extract for 3 h at 4 °C. Beads were washed 
four times with 1 ml buffer (20 mM HEPES pH 7.9, 400 mM KCl, 0.01% NP-40, 5 
mM 2-mercaptoethanol, 0.4 mM PMSF) and eluted in 1X SDS loading buffer. 
Data sets. The following published next-generation sequencing data sets were 
meta-analysed in this study: (1) ChIP-RNA Pol II (CTD4H8), H3.3-HA in
HIRA wild-type, HIRA-null, C57BL/6J ES cells and NPCs; ATRX; ESET;
H3K9me3 and SUV39h1/2 (ref. 27); H4K20me3 (ref. 17); H3K9me2 (ref. 33);
H3K4me1, H3K4me3, H3K27me3, H3K27ac, H3.3-HA from C57BL/6J ES cells;
(2) DNase I Hypersensitivity (ENCODE U Wash), MNase accessibility; and (3)
RNA-seq in H3.3B-HA and H3.3-knockout/knockdown C57BL/6J ES cells.
Data sets used for individual figure panels are described in Supplementary 
Table 1. 
ChIP-seq analysis. ChIP-seq libraries were prepared according to the Illumina 
protocol and sequenced with HiSeq 2000. Raw reads in FASTQ format were 
aligned to the mouse genome version mm9 with Bowtie using “-m 1 --best”
parameters for unique alignments and “-M 1 --best” parameters for inclusive 
alignment of non-unique reads. The former parameters instruct Bowtie to report 
a maximum of one match per read and discard any read that cannot be mapped to 
a unique best match. The “-M 1 --best” parameters ensure that only one alignment 
is reported for each read. This is either the single best alignment or, if more than 
one equivalent best alignment is found, one of those matches selected randomly. 
Input DNA mapped using the latter parameters extends evenly over the repeat
classes analysed in this study (namely IAP, ETn, MusD and L1 elements),
confirming a proportional representation of those repetitive sequences relative to the
unique genome (Extended Data Fig. 2a, b).
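The difference between the two mapping policies can be illustrated with a toy sketch. It follows the description in the text above, not Bowtie's internals; the function name, the `(position, mismatches)` tuples and the mode labels "unique" and "inclusive" are hypothetical choices made for illustration only.

```python
import random


def report_alignment(hits, mode, rng):
    """Choose which alignment to report for a single read.

    hits: list of (position, mismatches) candidate alignments (toy data).
    mode: "unique"    -> mimics the described "-m 1 --best" behaviour,
          "inclusive" -> mimics the described "-M 1 --best" behaviour.
    Returns the reported hit, or None if the read is discarded.
    """
    if mode not in ("unique", "inclusive"):
        raise ValueError("mode must be 'unique' or 'inclusive'")
    if not hits:
        return None  # unaligned read

    # Keep only the alignments that tie for the fewest mismatches.
    fewest = min(mismatches for _, mismatches in hits)
    best = [hit for hit in hits if hit[1] == fewest]

    if len(best) == 1:
        return best[0]  # a single best match is reported in both modes
    if mode == "unique":
        return None  # ambiguous read is discarded
    return rng.choice(best)  # one equivalent best match, chosen at random
```

In this sketch a read with two equally good matches is dropped in "unique" mode but still contributes one randomly placed alignment in "inclusive" mode, which is why repeat families remain represented under inclusive mapping.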

Bowtie SAM output files were converted to sorted BAM files using SAMtools.
For unique alignments, duplicate reads were filtered using the rmdup function of
SAMtools. Wig files were created from BAM files using the IGVTools count function
(The Broad Institute) and scaled to a genome-wide average read density of 1 using the
java-genomics-toolkit wigmath.Scale function (as a reference, 17.5 million mapped
reads at a fragment size of 150 bp yield an average genome-wide read density of 1
for mm9). Figures of these continuous tag counts over selected genomic intervals
were created in the IGV browser (The Broad Institute).
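The scaling arithmetic can be checked with a short sketch. The genome length used here is not stated in the text; it is simply the value implied by the reference figures quoted above (17.5 million reads × 150 bp), and the function names are hypothetical.

```python
# Genome length implied by the quoted reference point (an assumption,
# derived as 17.5 million reads x 150 bp; not a value given in the text).
IMPLIED_GENOME_BP = 17_500_000 * 150


def mean_read_density(mapped_reads: int, fragment_bp: int, genome_bp: int) -> float:
    """Average number of fragments covering each base of the genome."""
    return mapped_reads * fragment_bp / genome_bp


def scale_factor(mapped_reads: int, fragment_bp: int, genome_bp: int) -> float:
    """Multiplier that rescales a coverage track to a genome-wide mean of 1."""
    return 1.0 / mean_read_density(mapped_reads, fragment_bp, genome_bp)
```

Under these assumptions, a library with twice as many mapped reads at the same fragment size would be multiplied by 0.5 to reach the same genome-wide mean.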

Repetitive genome ChIP-seq analysis. The current build of rodent repeat
sequences was downloaded from Repbase (http://www.girinst.org/repbase/) and
filtered for Mus musculus sequences. A Bowtie index was created with
bowtie-build. Raw ChIP-seq FASTQ reads were mapped to the repetitive sequence
database using Bowtie “--best” and “-k 1” options. A table of mapped short read counts
per repetitive element was extracted from the BAM file using the SAMtools idxstats
function. Further analysis was performed with R and visualized as heat maps using
GENE-E. Mapped read counts were expressed as a fraction of total mapped
repetitive reads for each sample. For enrichment analysis, normalized read counts
of ChIP samples were divided by normalized read counts of a matched input
sample and expressed as log2 fold enrichment. In addition, the following quality
controls were performed: read distribution across the repetitive sequence was
inspected using the IGV genome browser for each repeat family to confirm coverage
of the whole repetitive sequence. To avoid over- or underestimating fold
enrichments due to low sequence representation, repetitive sequences with consistently
fewer than ~100 mapped reads per sample or control were excluded from analysis.
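The normalization and filtering steps above can be sketched as follows. The original analysis was performed in R; this Python version is only an illustration, the function names are hypothetical, and the filter is a simplification that drops a repeat when either the ChIP or the input count falls below the threshold.

```python
import math


def read_fractions(counts):
    """Express per-repeat read counts as a fraction of all mapped repetitive reads."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


def log2_enrichment(chip_counts, input_counts, min_reads=100):
    """log2(ChIP fraction / input fraction) for each sufficiently covered repeat.

    Simplifying assumption: a repeat is skipped if it has fewer than
    `min_reads` reads in either the ChIP sample or the matched input.
    """
    chip_frac = read_fractions(chip_counts)
    input_frac = read_fractions(input_counts)
    enrichment = {}
    for name, chip_reads in chip_counts.items():
        input_reads = input_counts.get(name, 0)
        if chip_reads < min_reads or input_reads < min_reads:
            continue  # too few reads for a stable ratio
        enrichment[name] = math.log2(chip_frac[name] / input_frac[name])
    return enrichment
```

For example, a repeat family holding 80% of ChIP reads but 40% of input reads would score a log2 enrichment of 1 in this sketch, while a family with only a handful of reads would be left out instead of producing an unstable ratio.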
Peak calling. Peaks were called for H3.3, H3K9me3 and total H3 ChIP-seq data
sets from control C57BL/6J ES cells, including non-unique reads. MACS
ChIP-seq peak finding was performed against a matched input using cut-off values
“--pvalue 1e-6 --mfold 10,50”. In total, 79,532, 72,811 and 29,189 peaks were called for
H3.3, H3K9me3 and KAP1, respectively. For total H3, only 996 peaks were called
with the same parameters.

Enrichment analysis over H3.3 peaks. For Fig. 1a, enrichment of H3.3 and
histone modifications over H3.3 peaks was calculated as follows: average
ChIP-seq read densities over the peak interval defined in the MACS bed output file
were extracted from normalized wig files using the java-genomics-toolkit
ngs.IntervalStats function (T. Palpant; http://palpant.us/java-genomics-toolkit/).
ChIP-seq enrichment for each interval was subsequently normalized by dividing the
mean read density of the ChIP-seq sample by the corresponding density of the
matched input sample. Data were visualized in a heat map as log2 fold enrichment
over input and clustered with GENE-E (The Broad Institute).

Enrichment analysis over repetitive and unique genomic regions. For Figs 1g,
2d, f, 3a and Extended Data Figs 2a-c, 5, 6, 8d, 9c, f, intervals were derived from the
following sources: transcription start sites (TSSs) of ~2,000 highly active genes
previously shown to be enriched in H3.3 were defined as intervals from −1 kb to
+1 kb around their annotated TSS. H3K27me3-containing promoters (K27pro)
were previously characterized. Curated sets of IAP, IAPd, RLTR10, ETn, MusD,
ERGLN, ERVK10C and L1Md_A repeat locations were used. Additional intact
IAP elements were identified using the BLAT function of UCSC and combined
with the existing IAP data set. Intact LINE L1Md_F promoters/5’ UTRs were
identified in the reference genome using BLAT with the RepBase sequence
LIMM_F_L1.

Enrichments over these intervals were calculated as described earlier from
normalized ChIP and input wig files using wigmath.IntervalStats. Log2 fold
enrichments over individual intervals were summarized using R in boxplots
(Tukey box-and-whisker plots using R boxplot defaults). Specifically, the box
indicates median, as well as upper and lower quartiles of the data. Whiskers extend
to the most extreme datapoint within 1.5 times the interquartile range (IQR).
Outliers are not shown. Significance levels were calculated using Wilcoxon tests:
not significant, P > 0.05; *P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001.
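The whisker rule and the significance annotation can be sketched as below. This is an illustration only, not the R code used for the figures: quartiles are passed in explicitly rather than recomputed the way R does, the function names are hypothetical, and a P value of exactly 0.05 is treated as not significant, which the text leaves unspecified.

```python
def whisker_limits(values, lower_quartile, upper_quartile):
    """Most extreme data points lying within 1.5 x IQR of the box edges."""
    iqr = upper_quartile - lower_quartile
    low_fence = lower_quartile - 1.5 * iqr
    high_fence = upper_quartile + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return min(inside), max(inside)


def significance_label(p_value):
    """Map a P value to the star notation listed in the text."""
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "n.s."  # assumption: P >= 0.05 is reported as not significant
```

With quartiles of 2 and 5, for instance, a value of 20 lies beyond the upper fence of 9.5 and would be treated as an outlier, so the upper whisker stops at the largest value inside the fence.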
Peak profile heat maps. For Figs 1c, 2a and Extended Data Figs 8c, 9e, peak profile 
heat maps were calculated using ngsplot over a 5-kb window around the MACS
peak centres (parameters: -SC global -I 0 -L 2500 -MQ 0 -RB 0.05) from BAM files 
using the inclusive mapping procedure. Data sets are normalized to total read 
counts and all maps are represented on the same global scale. 

Analysis of regions flanking repetitive elements. For Extended Data Fig. 4, profiles
were calculated from uniquely mapped reads only, that is, non-unique reads and
PCR duplicates were discarded before calculating the coverage using the IGVtools
count function (see earlier). Profiles over flanking regions were aggregated using
the sitepro function from the CEAS suite with the following modifications:
profiles were not centred over the element but instead separately collected for
the 3’ and 5’ flanking regions. The mean of the profiles in two 500-bp windows
(5’ and 3’) was extracted for each interval as an approximation of enrichment
over the central, repetitive, interval. Profiles were either visualized as a heat map
(using GENE-E), or averaged into a single plot (CEAS sitepro). Wig files were
normalized to a global average of 1, thus the ordinate of the profile plot represents
fold enrichment/depletion over a random genome-wide distribution of reads.
RNA-seq preparation and analysis. RNA was isolated using QIAGEN RNeasy. 
Libraries were prepared according to the Illumina TruSeq protocol and were 
sequenced on the HiSeq 2000. Resulting reads (101 nucleotides) were aligned to 
the mouse genome (mm9) using TopHat. Gene expression level measured as
FPKM was determined by the maximum likelihood estimation method
implemented in the Cufflinks software package with annotated transcripts as references.
Differential expression was analysed using the Student’s t-test in the program 
Cuffdiff with P values corrected for multiple testing. 

De novo mapping of unannotated ERVs. Genomic DNA from H3.3 wild-type
and KO1 ES cells was sheared to an average of 500 bp. Illumina paired-end
sequencing was performed with 50 bp read lengths. ERVs were mapped to the
reference genome in a two-step procedure. First, all reads were mapped to a
genome consisting of all RepBase sequences belonging to a specific ERV class (for
example, IAPs), using Bowtie2. Next, unpaired read pairs (where one mate
matched an ERV sequence but the other could not be aligned) were extracted
using SAMtools and mapped to the mm9 reference genome using Bowtie (allowing
only for uniquely mappable reads). This strategy allowed us to anchor each ERV
integration site with up to 10 uniquely mappable reads on either side of the
repetitive sequence. Plus/minus-strand-specific wig coverage tracks were created
using IGVTools, extending reads to 500 bp. We took advantage of the fact that
left-hand anchor reads mapped exclusively to the plus strand and right-hand anchor
reads to the minus strand. Thus, while existing ERVs were demarcated by a plus
peak on the left and minus peaks on the right of the repeat sequence,
non-annotated integration sites were characterized by a plus peak directly overlapping with a
minus peak at the insertion site. Plus and minus peaks were identified separately
using the FindOutlierRegion function of the java-genomics-toolkit on split plus and minus wig
files. Peak intervals were then intersected to find overlapping plus/minus peaks.
Wild-type and KO1 ES cell peaks were intersected and new integration sites were
only called if a plus/minus peak did not overlap with a minus or plus peak in the
respective control data set. IAP integration sites were validated by genotyping,
using primer pairs spanning a ~300 bp region between the IAP LTR and the
unique flanking region.
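The final intersection logic can be sketched with simple interval arithmetic. This is an illustration under stated assumptions, not the authors' pipeline: the `Interval` type, the function names and the half-open coordinate convention are all hypothetical, and peak finding itself is taken as already done.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # exclusive (half-open convention, an assumption)


def overlaps(a, b):
    """True if two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def plus_minus_overlaps(plus_peaks, minus_peaks):
    """Regions where a plus-strand peak directly overlaps a minus-strand peak."""
    shared = []
    for p in plus_peaks:
        for m in minus_peaks:
            if overlaps(p, m):
                shared.append(Interval(p.chrom, max(p.start, m.start), min(p.end, m.end)))
    return shared


def novel_integration_sites(sample_plus, sample_minus, control_plus, control_minus):
    """Candidate sites in the sample that overlap no plus or minus peak in the control."""
    control_peaks = list(control_plus) + list(control_minus)
    candidates = plus_minus_overlaps(sample_plus, sample_minus)
    return [c for c in candidates if not any(overlaps(c, k) for k in control_peaks)]
```

The nested loops are quadratic in the number of peaks, which is acceptable for a sketch; a genome-scale analysis would normally sort the intervals or use an interval tree.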
Extended Data Figure 1 | H3.3 and H3K9me3 correlate within the mouse
repetitive ES cell genome. Related to Fig. 1. Hierarchically (Spearman rank)
clustered heat map showing occupancy of histone H3.3 and known
heterochromatic histone modification and factors over a comprehensive set of
mouse repetitive sequences (see Methods for details). Published data sets used
are listed in the Methods section. Data are represented as log2 fold enrichment
over matched inputs for each ChIP data set. Repeats with less than 0.01%
abundance are omitted.
Extended Data Figure 2 | H3.3 and H3K9me3 co-occupy class I and II
ERVs. Related to Fig. 1. a, Direct comparison of H3.3 enrichment at genic and
repetitive sites. Box plot (top) showing enrichment of H3.3 over sets of intervals
either representing genic or repetitive elements annotated in the reference
genome, using inclusive read mapping. H3.3 ChIP was performed using an
H3.3 antibody and formaldehyde (FA) crosslinking in an H3.3 wild-type (WT) cell
line. H3.3 enrichment is shown as standardized ChIP-seq read density divided
by the standardized input read density on a per-interval basis. The width of the
box is proportional to the number of intervals in each group. TSS, transcription
start sites of highly active genes; K27pro, bivalent promoters. Box plot
(bottom) shows the input read density (standardized by scaling to a
genome-wide mean of 1), confirming the even representation of unique and repetitive
sequences resulting from the inclusive mapping procedure (see Methods for
details). Result of one-sided Wilcoxon rank sum test against a set of randomly
selected genomic intervals (shuffled) is indicated (****P < 0.0001).
b, H3K9me3 enrichment at genic and repetitive sites. H3K9me3 ChIP was
performed using MNase digestion under native conditions. Box plot (top)
showing enrichment of H3K9me3 over sets of intervals either representing
genic or repetitive elements analogous to a. Box plot (bottom) shows the input
read density analogous to a. Result of one-sided Wilcoxon rank sum test against
a set of randomly selected genomic intervals (shuffled) is indicated
(****P < 0.0001). c, Sequential H3.3 and H3K9me3 (re)-ChIP at genic and repetitive
sites. Box plots showing enrichment of Re-ChIP inclusive read mapping relative
to an input control. Result of one-sided Wilcoxon rank sum test against a set of
randomly selected genomic intervals (shuffled) is indicated (****P < 0.0001).
d, Co-occupancy of H3.3 and H3K9me3 at specific classes of ERVs. H3.3 and
H3K9me3 peak intervals were independently intersected with annotated ERVs
and their co-occurrences within the same ERV were evaluated. L1Md_F (full) is
a subset of L1Md_F, comprising only full-length repeats (>5 kb). All pie charts
include total number of intervals for each family that had none, (at least) one
H3.3 peak (H3.3 only), or H3K9me3 peak(s) (H3K9me3 only), or at least one of
each (H3.3+H3K9me3).
Extended Data Figure 3 | Generation of H3.3-isoform-specific antibodies.
Related to Fig. 1. a, Schematic of amino acid sequence differences for the
canonical histones H3.1 and H3.2 versus the histone variant H3.3. H3.3 differs
from H3.2 or H3.1 at only 4 or 5 amino acids, positions 31, 87, 89, 90 and 96, as
indicated. b, Immunoblot against recombinant histones using the final purified
antibody (Millipore 09-838), confirming specificity of the
H3.3-isoform-specific antibody. c, ChIP-qPCR analysis of H3.3 enrichment at various repeat
regions in control and H3.3-knockout ES cells. Error bars represent s.d. from
one experiment (n = 3). d, ChIP-seq enrichment of H3.3 at repetitive regions of
the mouse genome in control and H3.3-knockout ES cells. Data are represented
in a heat map of log2 fold enrichment (red) or depletion (blue) over a matched
input.
Extended Data Figure 4 | H3.3 is enriched in regions flanking ERVs and
orphan LTRs. Related to Fig. 1. a, ChIP-seq density heat maps for unique sites
flanking full-length IAP ERVs (n = 800) rank ordered by H3K9me3
enrichment. Colour intensity represents normalized and globally scaled tag
counts. b, H3.3 (top) and H3K9me3 (bottom) enrichment over regions flanking
IAP, ERVK10C, ETn ERVs and L1 elements. H3.3 ChIP-seq was performed
with FA crosslinking, H3K9me3 ChIP-seq under native conditions. Average
profiles were aligned and aggregated at the 5’ and 3’ boundaries of hundreds of
annotated elements from standardized unique read count coverage tracks. The
profiles are directional with the 5’ end on the left and 3’ end on the right.
c, H3.3 (top) and H3K9me3 (bottom) enrichment over regions flanking single
(so-called orphan) IAP LTRs, ~500 bp. Orphan LTRs are the result of a
recombination event between two LTRs (usually the 3’ and 5’ LTRs of the
same ERV), effectively deleting the internal coding sequence. Approximately
600 full-length LTRs (~500 bp) enriched in H3.3 and H3K9me3 were
identified in the mouse genome and aggregated for the profiles.


©2015 Macmillan Publishers Limited. All rights reserved 


[Figure panel a: box plots of read density over input at IAP and TSS intervals; panel labels include DNase I, MNase (short) and MNase (long).]


Extended Data Figure 5 | H3.3 at IAPs is not associated with transcription,
DNase I or MNase sensitivity. Related to Fig. 1. a, Direct comparison of
chromatin properties at TSSs of highly expressed genes and IAP ERVs. Box
plots showing (from left to right) comparable enrichment of H3.3; DNase I
sensitivity; MNase sensitivity; elongating RNAP2 occupancy. MNase data sets
are from a recent study, showing H3.3 localizing to MNase hypersensitive
regions such as active promoters. In this study, MNase sensitivity was assessed
by sequencing nucleosomes released under mild (‘short’) or extensive (‘long’)
MNase digestion conditions; MNase hypersensitive sites were shown to be
specifically enriched by mild MNase digestion, whereas long digestion released
chromatin more evenly. b, Comparison of kinetics of H3.3 incorporation at
the TSS of highly expressed genes and IAP ERVs; as control, a randomized set
of intervals is shown.
Extended Data Figure 6 | H3.3 and ESET-dependent H3K9me3 enrichment
at IAPs is lost upon differentiation. Related to Fig. 1. a, b, Comparison of
H3.3 (a) and H3K9me3 (b) enrichment at the TSS of highly expressed genes
and various repeat classes in ES cells and NPCs using inclusive read mapping.
H3.3 ChIP was performed using a genomic knock-in tagged H3.3B-HA and FA
crosslinking. H3K9me3 ChIP was performed using FA crosslinking.
Enrichment is shown as standardized ChIP-seq read density divided by the
standardized input read density on a per-interval basis. Results of one-sided
Wilcoxon signed rank tests (NPCs versus ES cells) are shown (****P < 0.0001;
***P < 0.0005; **P < 0.005; *P < 0.05; no annotation = not significant).
c, Levels of H3K9me3 enrichment in control and ESET-knockout ES cells (top)
or control and SUV39h1/2-knockout ES cells (bottom) at various repeat
classes. DKO, double knockout. Data are represented as in a and b.
Extended Data Figure 7 | Contribution of DAXX, ATRX, KAP1 and ESET
to H3.3 enrichment at ERVs. Related to Fig. 2. a-d, ChIP-qPCR analysis of
H3.3 enrichment at various repeat regions in control and ATRX-knockout
(a), DAXX-knockout (b), KAP1-knockout (c) and ESET-knockout (d) ES cells.
Error bars represent s.d. from one experiment (n = 3). Data are representative
of two independent ChIP experiments. e, ChIP-seq enrichment of KAP1 and
H3.3 in control and KAP1-knockout ES cells at repetitive regions of the mouse
genome. Data are represented in a heat map of log2 fold enrichment (red) or
depletion (blue) over a matched input. f, Loss of H3.3 reduces nuclear DAXX
levels. Immunoblot from whole-cell extracts (WCE) or nuclear extracts (NE) in
the presence and absence of H3.3. Asterisk denotes cross-reacting band.
g, ChIP-seq enrichment of KAP1 and DAXX in control and H3.3-knockout ES
cells. Data are represented as in e. Note the different colour scale used for KAP1
and DAXX.
Extended Data Figure 8 | Effects of H3.3 and corepressor complex depletion
on H3K9me3 heterochromatin. Related to Fig. 3. a, Positive correlation of
H3.3 and H3K9me3 at IAP ERVs. H3.3 ChIP-seq enrichment at 800 unique
IAP flanking regions (see Fig. 1e) was binned into three groups by their
H3K9me3 ChIP-seq enrichment (low, medium and high). Wilcoxon rank sum
test (****P < 0.0001). b, Immunoblot from ES cell whole-cell lysates in the
presence and absence of H3.3. c, H3.3, H3K9me3 and KAP1 ChIP-seq density
heat maps for peaks classified as H3.3 only (n = 60,925), both H3.3 and
H3K9me3 (n = 18,605), or H3K9me3 only (n = 54,204) in control and
H3.3-knockout ES cells. Five-kilobase intervals around peak centres are shown.
Colour intensity represents normalized and globally scaled tag counts. d, Levels
of H3K9me3 at IAP, ETn, MusD ERVs and LINE elements in control and
KAP1-knockout ES cells (top) and control and H3.3-knockout ES cells
(bottom). Box plots show enrichment over matched input. e-h, ChIP-qPCR
analysis of H3K9me3 at various repeat regions in control and KAP1-knockout
(e), ESET-knockout (f), ATRX-knockout (g) and DAXX-knockout (h) ES cells.
Error bars represent s.d. from one experiment (n = 3). Data are representative
of two independent ChIP experiments.
Extended Data Figure 9 | Global effects of H3.3 depletion. Related to Fig. 3.
a, H3.3 transcript levels in control, H3.3-knockdown and H3.3-knockout ES
cells. Data are represented as mean expression relative to Gapdh ± s.d. (n = 3).
b, Relative gain/loss upon H3.3 knockdown of H3K9me3, H3.3 and total H3 is
shown over a section of chromosome 10 containing the highly transcribed
Rps12 gene and several ERVs. Gain/loss tracks are calculated by subtracting the
respective control from H3.3 KD1 tracks, both standardized to a global mean of
1. Note that H3.3 ChIP-seq data in KD1 cells represent the remaining 10%
H3.3. The global loss of H3.3 is not directly apparent from the track due to the
necessary normalization of the data. The H3.3 difference track thus does not
indicate the global loss of H3.3 but merely represents the relative redistribution
of the remaining H3.3 from active genes (Rps12) towards repetitive sequences.
c, Levels of H3.3 and H3 at IAP, ETn, MusD, and the TSS of highly expressed
genes in control, H3.3-knockdown and H3.3-knockout ES cells. Box plots show
enrichment over matched input. d, Incorporation of exogenous, constitutively
expressed, H3.3 and H3.2 added back into H3.3-knockdown or H3.3-knockout
ES cells. H3.2 cannot substitute for H3.3 at repetitive ERVs but is efficiently
incorporated at sites of active transcription. ChIP-seq was performed on
lentivirally integrated H3.3-haemagglutinin (HA) and H3.2-HA in H3.3 KD1
and H3.3 KO1. e, ChIP-seq density heat maps for peaks classified as enriched
with both H3.3 and H3K9me3 (n = 18,605) or H3.3 only (n = 60,925). Colour
intensity represents tag counts scaled and normalized globally. Five-kilobase
intervals around peak centres are shown. f, Quantification of H3.3-HA and
H3.2-HA add-back in H3.3-knockout enrichment at low and highly expressed
genes, as well as the TSS (±1 kb) of the latter, IAP, ETn and MusD ERVs, and
full-length LINE elements and their 5’ promoter regions. Data are represented
as enrichment over input.
Extended Data Figure 10 | ERV reactivation upregulates adjacent genes and
may be linked to unbalanced chromosomal translocations. Related to Fig. 4.
a, Repetitive elements associated with genes in Fig. 4b. Elements that were
found either within or nearby the transcription unit are listed and the closest
distance of an ERV to an exon is given (accounting for the possibility that ERVs
could initiate a partial transcript from an alternative start site). b, Newly
annotated sites of IAP integration in wild-type and H3.3 KO1 are indicated on
karyogram. c, Karyotype analysis of wild-type and H3.3-knockout ES cells.
Abnormal karyotype is indicated by arrows. All analysed cells in H3.3 KO1 had
a small reciprocal translocation between chromosomes 2q and 6q and an
unbalanced translocation between chromosomes 6 and 17 resulting in partial
gain of chromosomal segment 6qD to 6qG and partial loss of chromosomal
segment 17qE2 to 17qE5. Approximately 45% of the cells had chromosomal
breaks or gaps (1-2 per cell). Approximately 45% of the H3.3 KO2 ES cells had a
duplication of the segment 8qC to 8qD resulting in partial gain of this segment.






CAREERS 




A strategic move 


Short-term upheaval can yield widespread collaborations
and long-term resources.

BY JULIE GOULD

Roberto Salguero-Gomez has moved countries six times in the past
dozen years. Each move represented the
next step in his academic career: he stacked
up degrees in his native Spain, the United
Kingdom and the United States before gaining
his first postdoctoral position in Germany
and then moving to his current postdoc job
at the University of Queensland in Brisbane,
Australia. Along the way, he put in stints as a
research assistant in Austria and Spain.

From the moment he arrived at his latest
post, Salguero-Gomez devoted most of his time
to writing proposals to extend his stay in the
country, and so recoup the effort of his move.
His persistence paid off with a three-year
fellowship from the Australian Research Council.
Although he still has more than 18 months 
before his contract runs out, he is intent on 
pursuing a position in Europe next. “In the last 
year, I have applied for 22 jobs across the world, 
most of which I have not heard back from,” he 
says. Such globe-spanning searches, he says, are 
the nature of scientific research. 

Indeed, mobility is the reality for many
early-career scientists. In 2012, Nature conducted
an international poll on attitudes towards
researcher mobility. It found that those who
had received their PhDs recently (in the past 2
years) were particularly open to moving
internationally; only 10% said that they were “not
interested”, compared with 40% of those who
earned their doctorate at least 16 years ago.
Recent graduates were also more likely to be 
living outside their country of upbringing (see 
Nature 490, 326-329; 2012). 

Salguero-Gomez criss-crossed the planet 
because he thought it would make him more 
successful. Data largely support that stance. In 
2012, Paula Stephan, an economist at Georgia 
State University in Atlanta, and her colleagues 
published a study of scientists from 16
countries. The survey, termed GlobSci, found that
those who left their country of origin
outperformed non-mobile scientists, as measured by
the impact factor of journals that included their
work (C. Franzoni, G. Scellato and P. Stephan 
Nature Biotechnol. 30, 1250-1253; 2012). 

But not all moves bring equal benefits, says 
Stephan. Her work suggests that the biggest 
career boosts come from relocating for a
postdoc position or to gain a specific skill. And, not
surprisingly, those who have a position lined 
up will find getting a work permit much easier, 
adds Rachel Banks, director of public policy at 
NAFSA: Association of International Educators, 
headquartered in Washington DC. 

In his quest to determine whether his mobile 
lifestyle was beneficial for his career,
Salguero-Gomez read many opinion pieces, but he
struggled to find concrete data on the subject. He
has since developed his own survey to explore
whether mobility confers greater productivity
(calculated by the number of one’s publications)
and what cost it imposes on happiness
(measured by perceptions of work-life balance).

His preliminary results, he says, show that
researchers who travelled during the
master’s and PhD stages of their careers have
published fewer papers compared with their
non-mobile counterparts, and that those who
moved during a first and second postdoc stint
have published more.


ECONOMIC ADVANTAGE 
The clearest-cut cases of mobility benefits 
involve scientists who relocate from
economically disadvantaged nations. In 2001, Fernando
Colchero moved from Mexico to the United
States to start his PhD at Columbia University in
New York City. It turned out to be a fairly bumpy
ride: a year into his programme, he found
himself relocating again to follow his supervisor
to Duke University in Durham, North Carolina.
Soon after, a collaboration he had been
developing for his PhD research on tigers in India fell
through, and he switched supervisors entirely.
He finally finished his PhD in 2008, two years
later than he had hoped. The setbacks were
frustrating, but Colchero reckons that overall, he
still came out ahead. “The direction my career
took was only possible because I left Mexico.” 
In 2008, he and his wife accepted postdoc
positions at the Max Planck Institute for
Demographic Research in Rostock, Germany.
This move, he says, was less challenging than his
first: he was used to acclimatizing to a different
culture and had no teaching responsibilities;
as a result, he was able to focus on experiments,
and his research output soared. His next
move took him to the University of Southern
Denmark in Odense, where he now works as
an assistant professor, and he hopes to attain a
tenure-track position soon. Although he would
like to remain in Denmark, he is applying to
positions outside the country, just in case.


SURVIVAL STRATEGIES

How to cope with relocation

No matter where a researcher ends up,
there are a few strategies that can help to
settle into an unfamiliar environment.

• Follow the subject. Biologist Sara
Sandin moved to Nanyang Technological
University in Singapore because the
research field in which she specialized
was getting a lot of funding there. “Think
about what you want to do your research
in first,” she says, “then find out where
the best people in that field are, and
speak to them.”

• Get a job first. If you have the
acceptance letter before going through
work-permit and visa applications, the
processes will be much smoother.

• Keep home networks. You never know
where your career path might lead you,
and you could end up on a path that
returns you to your country of origin. Stay
in touch with previous supervisors and
colleagues so that you maintain options
that could bring future jobs with them.

• Plan ahead. If you know what field you
want to work in, do some research to
find out where the best opportunities for
work are. If they are in a country where
the language is unfamiliar, take time to
learn it in advance. J.G.

Julia Barthold conducts field work on a nest-box study at the University of Southern Denmark in Odense.


NETWORK CONNECTIONS 

PhD students and postdocs generally move
abroad in search of opportunities, but
developing networks in unfamiliar countries takes
time and energy. In 2012, economist Giuseppe
Scellato at the Polytechnic University of Turin in
Italy worked with his colleagues to build on data
from their initial GlobSci survey. They showed
that mobile scientists develop broad networks
that span continents and age groups, which
could ultimately benefit their careers (G. Scellato,
C. Franzoni and P. Stephan Natl Bur. Econ.
Res. Working Pap. Ser. 18613; 2012).

Duan Biggs, a postdoc in ecology and 
environmental science at the University of 
Queensland, spent a year doing research in 
Chile so that he could learn Spanish. Like 
Colchero, Biggs faced some initial setbacks. 
Thanks to communication snags and issues
getting in touch with the right people, it took
him a long time to get access to his lab after
typical work hours, and language-barrier problems
prolonged his apartment search for two months.
“But looking back, moving abroad was a really
fantastic decision,” he says. “My scientific
capacity and career really took off.” He is forming
collaborations with researchers across South 
America, his native South Africa and Australia. 

Many scientists go through an unstable 
period as they search for a home, adapt to a 
culture and, sometimes, learn another lan- 
guage. Just the prospect of shifting countries 
can be unsettling. “The toughest thing about 
moving abroad is making the decision,” says 
Owen Jones, an evolutionary biologist at the 
University of Southern Denmark, who held 
three consecutive UK postdoc positions from
March 2005 to December 2009. But moving 
away from one social network and forming 
another has benefits, he says. “Because I’ve also 
changed research fields a lot, I have a very broad 
network, which means forming collaborations 
is much easier.”

Being able to adapt to an unfamiliar situation 
is a key survival skill for the mobile academic 
(see ‘How to cope with relocation’). “Some days
will be tough and you won't be happy about your
decision to move,” says Jones. “You will probably
have misconceptions about what things will be 
like.” He experienced this when he moved to 
Germany after being told he could get along 
fine speaking only English. That might have 
been true in a big city such as Berlin, he says, but
when he moved to the small town of Rostock, he 
was forced to learn German. 

Jones’s partner, Julia Barthold, a recent PhD 
graduate who models animal populations, has 
moved several times for scientific posts. A part- 
time research assistant at Max Planck, Rostock, 
she is seeking work in the private sector and the 
couple are considering their next move. They 
have also postponed having children. “Moving 
to Denmark with Owen was strategic: they've 
got a great social system that supports young 
families,” she says. But the pair recently learned 
that Owen's position is not as permanent as 
they thought, and they are weighing the per- 
sonal costs of another move against the profes- 
sional benefits. “We've become a nomadic team 
— entirely self-reliant and self-contained,” she
says. “It’s difficult to build networks of friends
and maintain them if you’re moving every year.”


HOME FIRES BURNING 

Claudio Quilodran is maintaining his connec- 
tions in Chile while he works towards his PhD 
at the University of Geneva in Switzerland. He 
is carrying out his ideal project: investigating the 
biodiversity lost as a result of invasive species 
and cross-species breeding. But he knows that 
his time abroad is limited. A condition of his
scholarship from Chile is that he move back
within two years of finishing his PhD — and 
he wants to make sure his South American 
connections stay strong. “Every time I go 
home to visit family, I also make sure I visit 
my old MSc professor,” he says.

Roving scientists may really reap their 
rewards when they come home, says Stephan 
— as can the country to which they return.
“Returnees are likely to continue to collabo- 
rate with scientists in the country where 
they trained and thus provide a means 
of diffusion of knowledge in the home 
country,” she says. They also train new generations
of scientists, passing on knowledge
gained from different countries and cultures. 

That rings true for Mehmet Somel. 
He moved from Ankara, Turkey, to the 
University of Leipzig, Germany, for his PhD, 
and then to postdoc positions at the CAS- 
MPG Partner Institute for Computational 
Biology in Shanghai, China, and the Uni- 
versity of California, Berkeley. Now, he has 
returned to Ankara as an evolutionary biolo- 
gist at the Middle East Technical University, 
from which he earned his undergraduate 
and master’s degrees. “Biology in Turkey 
is relatively underdeveloped compared to 
other disciplines, especially evolutionary 
genetics,” he says. “I wouldn't have been able 
to get the training and tools I needed to contribute
without going abroad.”

The benefits of working abroad in three 
highly diverse cultures continue to accrue. 
“I am still collaborating with nearly all the
people that I worked with,” Somel says. Not 
only that, but he had the opportunity to see 
how different laboratories were managed. “I 
could take away what I learned from each 
one and apply them to my lab in Ankara.” 

But not everyone returns. Colchero opted 
to move from Mexico to the United States 
because that was where he could pursue the 
studies that most interested him. He has 
considered moving back to Mexico at sev- 
eral points in his career, but it is looking less 
likely, he says. “The economic climate has 
made it almost impossible to get a job in academia.
So we've decided not to return.” His
is a common tale: a 2011 study found that
although one in eight of the world’s most 
highly cited scientists from 1981 to 2003 
were born in developing countries, 80% of 
this fraction worked in developed countries, 
mostly the United States (B. A. Weinberg 
J. Dev. Econ. 95, 95-104; 2010). 

Every researcher who relocates recounts 
a different experience, and the choice of 
whether to move comes down to weighing 
the odds. For Biggs, the pluses win out: being 
mobile as a researcher might affect one’s 
productivity in the short term, he says, “but 
when you’re looking longer term, you know
that it will benefit you in the end”.


Julie Gould is the editor of Naturejobs. 


TURNING POINT 


CAREERS 


Kai Landskron 


Like many researchers, chemist Kai Landskron 
struggles to piece together enough funding to 
keep graduate students in his lab at Lehigh 
University in Bethlehem, Pennsylvania. 

In March, he started an unconventional 
crowdfunding campaign — selling discount 
cards on his lab webpage that are valid at more 
than 100,000 restaurants, cinemas and shops. 


How would you describe the current US 
funding climate? 

Securing funding for research is the most 
difficult — and most frustrating — part of a job
that I otherwise love. The funding climate is 
very bad. I think that the effort needed to obtain 
research funding is no longer proportionate to 
the money that you get, and I believe that this 
is true for many people. I have enough funding 
from the US National Science Foundation, the 
US Department of Energy and Lehigh Univer- 
sity to work until the end of 2018. It is enough 
to support five postdocs and graduate students 
— the size of my lab for the past four years. 


What projects will donors be supporting? 

My group is developing nanoporous materials 
for use in greenhouse-gas reduction, catalytic 
converters, air and water purification and 
energy storage. 


How much of your time do you spend writing 
grant applications? 

More than half. I have applied for five to ten 
grants each year, including federal, state and 
private-foundation grants — basically, any 
opportunity that presents itself. I have more 
publications than one might expect relative to 
the number of personnel and dollars I have. 


How did you decide to sell discount cards? 

I wanted to pursue a crowdfunding model, 
but I did not want to use a platform such 
as Kickstarter. Instead, I wanted to use 
my own university website as the crowd- 
funding platform. 


Why? 

Kickstarter allows technology projects if they 
create a consumer product, but that is not a 
typical outcome of basic scientific research. And 
often, crowdfunding campaigns offer perks for 
different levels of donations. In researching my 
options, I found these discount cards, which are 
valued at US$10. But, depending on how often 
one uses them, they can actually save the card- 
holder more than $10, and they ship easily and 
inexpensively. Supporters can also give a chari- 
table donation. Often, funding organizations 
and the public seem to expect science to refund
society immediately. That is difficult to achieve, 
because scientific research is a long-term 
endeavour. But with this card, I can return the 
value to society — or at least to the donor. 


How many have you sold so far? 
Not very many. Fewer than 50. But I have had 
a few people also give donations. 


What is the campaign’s biggest challenge? 
Getting the word out. I am hopeful that talking
to the press will work as an advertisement that 
could spread further through social media. 


What do your colleagues think of the idea? 

The feedback has been positive — basically, 
people are saying that I’m showing ingenuity. 
One colleague in my department has changed 
his website to be able to receive donations, too. 


Would you contemplate moving to another 
country, where funding might be better? 

That is a complicated question in several ways. 
I am settled here for personal and professional
reasons. And leaving is a complex decision 
that would involve more than solely funding 
concerns. I have colleagues in other countries 
who say it is not easy to find funding where 
they are either. But if funding in another coun- 
try were dramatically improved, for example, I 
would consider it. 


What is the outlook for your future funding? 

To me, the tunnel seems to be getting darker 
rather than brighter. That is something that 
scares me. I’ve got 25 years ahead of me. If it gets 
even worse than now, that is a scary prospect.


INTERVIEW BY VIRGINIA GEWIN 


This interview has been edited for length and clarity. 


11 JUNE 2015 | VOL 522 | NATURE | 247 


SCIENCE FICTION


BY GEORGE ZEBROWSKI AND 
CHARLES PELLEGRINO 


“The box flickers from absolute zero to normal, back
and forth,” Cora said.

“It’s only the instruments,” her 
thesis adviser said, and laughed. 
“We’d all be dead if they were measuring
anything outside themselves.”

“Not if it’s far enough below the jiffy 
point,” she said. 

“Is this April first?” he asked, chuck- 
ling, well aware that they were looking 
far below the travel time of light across 
the diameter of a proton, below one bil- 
lion-trillionth of a second, below an instant 
called a jiffy. His neck tingled as he gazed 
into the black box. 

“The meter flickers at the lowest limit of 
detection,” Cora said as Professor Draud 
glimpsed the jitter. It had to be one of those 
irreproducible results, seen by one observer 
and not another, probably not there at all. 
Near the theoried bottom of time-framing, 
with a jiffy measuring the travel time of light 
across a proton, and if one imagined the pro- 
ton as the diameter of Earth, one would notice 
a passage of time smaller than light’s sprint 
across the eye of a flea on a jiffy-sized Earth.

“There!” Cora shouted. “Duration is up.” 

“If it’s really there,” he said, as if that would
stop the absurdity. 

Now you see it, now you don't, went on for 
an hour. Their necks tingled in unison and 
Cora imagined that all the errant twinges 
she had ever felt in her body had registered 
cosmic rays, neutrinos, irrational stray 
thoughts from previous boyfriends, tachy- 
ons allegedly travelling back in time, or 
even her cat’s psychotic probings. But this 
ambiguous nonsense about absolute zero 
spilling out of a black box from the realm 
of manifold space-time was a perversity 
worthy of money-hungry sci-fi writers. 

“Can our equipment be measuring this?” 
he asked. 

“Well, it’s doing it,” she replied, enhancing 
the resolution. 

Unlike most of his colleagues, Professor 
Draud encouraged students to lead him, if 
and when they could. 

The view through the black box was impossible.
“Two flea eyes at least, and growing,” he muttered,
no longer in denial.


JIFFY


Nothing will come of nothing. 


“Still not large enough or fast enough to 
be noticed by the rest of the Universe,” Cora
added. 

“But if it keeps expanding?”

“If it's everywhere at once, then there are 
three possible outcomes — all different levels
of bad.”

“And least bad?” he asked. 

“Another cosmic crunch, and maybe 
another inflationary universe afterwards.” 

Was that really it? The true default con- 
dition outside the multiverse? Untypically, 
Draud seized on the idea of an exterior 
infinity. Jean-Paul Sartre's nothingness was 
leaking into being. Outside the multiverses 
that bubbled into existence, the norm was 
absolute zero, which the bubbles resisted. 

Cora interrupted his wild thoughts. 
“When I began this, I was a little afraid, cer- 
tain that I would get no interesting results.” 

He cringed inwardly at the strangeness 
on the bench before them. “Three flea-eye 
diameters now,” he said.

They set up a fresh box with two monitors, 
like gamblers hoping that a new deck of cards 
would defeat the cheater in the game. But it 
all came back with the same vision of the 
entire brane of existence coming to a particle 
stop, all the granularity of the periodic table 
suddenly stilled, leaving a timeless existent. 
They stared as if blinded, expecting it all to 
wink out in a grotesque spasm of decay. 

“Can it be happening everywhere 
throughout the Universe?” Draud asked, 
eager to deny that much. Nothing in physics 
had ever been local. Even in the quantum
realm of imagined fleas on a
proton as large as Earth, the laws 
of physics were not suggestions, but 
obligatory. 

“From the very first jiffy of the 
entire visible Universe, it’s all been 
running down from energy into
matter, and towards the zero point...”
Draud said, remembering an errant 
wish to have fathered a child. 

“Six flea eyes!” he announced, suddenly 
seeing himself as a fearful inflation in an
infinite series of cosmic doomsdays and
rebirths. If the anomaly lurched beyond 
a thousand flea-leaps, quarks and gluons 
would phase into instantaneous non-exist- 
ence, with every living creature throughout 
all the galaxies suddenly gone, all choice lost 
in a death unlike any that had ever been and 
could only be imagined, yet as simple as an 
on/off switch. No rending and tearing of 
flesh, but the truest of evils, nothingness itself, 
to which all lesser evils were beholden, to be 
feared but never experienced. 

Student Draud had sometimes longed for 
a path along which science never became his 
life. What joyful memories would comfort 
him now? 

“It’s still increasing!” he shouted. How
many times had his life gone wrong, how 
often right? 

Did having progeny have anything to do 
with fathering universes? Were there intel- 
ligences who could do that? 

As Cora stared at him, the black behind 
his eyes fragmented her loveliness into 
chaos, mocking the logic that insisted on the 
impossibility of non-existence, the much
denied zero-point field... 

They teetered between everything and 
nothing, an ancient undefeated void at their 
throats, and he felt the onset of the distin- 
guished moment, so much greater than the 
usual dying, unequalled except by the lost 
strangeness of birth.


George Zebrowski is the Campbell Award-winning
novelist of Brute Orbits and the
classic Macrolife. All his fiction is available
from Gollancz’ SF Gateway (www.sfgateway.com)
and Open Road (www.openroadmedia.com).
Charles Pellegrino’s many books of
history and archaeology include Her Name
Titanic and To Hell and Back: The Last
Train From Hiroshima. His science-fiction
novels include Dust and The Killing Star
(with George Zebrowski).